High-Level Design Considerations
Most networks today fall into one of two broad categories: redundant and nonredundant. Redundancy typically increases complexity. The simplest networks often do not plan for failures or outages; they are commonly single-homed designs with multiple single points of failure. Redundancy can be built into different parts of a network. In the campus LAN portion of the environment alone, it may include redundant links, controllers, switches, and access points. Table 1-1 lists some of the common techniques that are introduced when dealing with redundancy.
Table 1-1 Common Redundancy Techniques
Redundant Links | Redundant Devices
Administrative distance | Redistribution
Traffic engineering | Loop prevention
Preferred path selection | Preferred path selection
Prefix summarization | Advanced filtering
Filtering |
Many redundancy options are available, such as redundant links, redundant devices, and EtherChannel. Having a visual of what some of these redundancy technologies look like is often helpful because it puts into context how the network will need to be configured and managed to support them. One of these technologies is Cisco Virtual Switching System (VSS), which bonds switches together to look and act like a single switch. The following are some of the benefits of VSS technology:
Simplifies operations
Boosts nonstop communication
Maximizes bandwidth utilization
Lowers latency
Redundancy can take many different forms. VSS is used for much more than just redundancy. It helps with certain scenarios in a campus design, such as removing the need for stretched VLANs and loops in the network. Figure 1-2 showcases an example of a campus environment before and after VSS and depicts the simplification of the topology.
Figure 1-2 VSS Device- and Link-Based Redundancy Options
Redundancy is not the only source of complexity in a network environment. Other sources include securing the network to shield it from malicious behavior, leveraging network segmentation to keep traffic types separate for compliance or governance reasons, and implementing QoS to ensure optimal application performance and increase users’ quality of experience. Having to configure these options manually complicates the network further. The networks of today are too rigid and need to evolve. The industry is moving from an era of connectivity-centric network delivery models to an era of digital transformation, and that transition requires a shift from hardware- and device-centric options to open, extensible, software-driven, programmable, and cloud-enabled solutions. Figure 1-3 depicts the transition in a simple summary. For many organizations, it is crucial to rely more on automation for day-to-day operational tasks and to get back time to focus on how the network can provide value to the business. This is delivered through policy-driven, automated, and self-optimizing capabilities, which provide closed-loop, automated service assurance that empowers network operations staff to move from a reactive approach to a more proactive and predictive one. Freeing up more of the operations staff’s time should enable them to focus on more strategic initiatives within the business.
Figure 1-3 Digital Transformation Transition
Intent-based networking (IBN) is taking the IT industry by storm. The concept revolves around signifying the intent of the business and automatically translating that intent into the appropriate corresponding networking tasks. This is a circular logic in that it captures the intent of the business and IT staff and then translates that intent into the appropriate policies that are required to support the business. Once the policies are created, the next step is to orchestrate the configuration of the infrastructure. This includes both physical and virtual components. This then kicks off the final step, which is providing assurance, insights, and visibility to ensure the network is functioning properly. Because this is a loop in a sense, the logic uses continuous verification and supplies any corrective actions that are necessary to fix or enhance the network’s performance. Figure 1-4 illustrates the intent-based networking model.
Figure 1-4 Intent-Based Networking
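The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (Intent, Network, translate, orchestrate, assure, run_loop) and the settings used are hypothetical and do not correspond to any Cisco product API. The sketch shows the three steps (translation, orchestration, assurance) and the continuous verification that triggers corrective action.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """A business intent, expressed as the settings the network should end up with."""
    name: str
    desired_state: dict  # hypothetical key/value settings, for illustration only


@dataclass
class Network:
    """Stand-in for the physical and virtual infrastructure; holds its current state."""
    state: dict = field(default_factory=dict)


def translate(intent: Intent) -> dict:
    """Translation step: turn the intent into a policy (here, a copy of the target settings)."""
    return dict(intent.desired_state)


def orchestrate(network: Network, settings: dict) -> None:
    """Orchestration step: push the given settings to the infrastructure."""
    network.state.update(settings)


def assure(network: Network, policy: dict) -> dict:
    """Assurance step: return every policy setting the network does not currently match."""
    return {k: v for k, v in policy.items() if network.state.get(k) != v}


def run_loop(intent: Intent, network: Network, max_rounds: int = 3) -> int:
    """Verify repeatedly and apply corrections until the network matches the intent."""
    policy = translate(intent)
    corrections = 0
    for _ in range(max_rounds):
        drift = assure(network, policy)
        if not drift:
            break  # network already reflects the intent
        orchestrate(network, drift)  # corrective action covers only what drifted
        corrections += 1
    return corrections


# Hypothetical example: the network starts with the wrong guest VLAN and no ACL.
intent = Intent("segment-guests", {"guest_vlan": 30, "guest_acl": "deny-corp"})
net = Network(state={"guest_vlan": 10})
rounds = run_loop(intent, net)
```

After the first call, the network state matches the intent, so running the loop again finds no drift and makes no further changes.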
Analytics and insights are critical to the networks of today. Typical network management systems (NMSs) do not provide the information necessary to resolve issues quickly and efficiently. They are reactive in nature and don’t supply the predictive monitoring and alerting that organizations require. Simple Network Management Protocol (SNMP) traps and syslog messages are valuable but haven’t been used as well as they could be. A reactive notification means that the issue or fault has already happened, so it does nothing to prevent the impact to the business. Often, there are false positives, or so many alerts that it is difficult to determine what information should be acted upon and what should be ignored. Traditionally, the network operations workflow has been similar to the following:
Receive an alert or helpdesk ticket.
Log in to the device(s) to determine what happened.
Spend time troubleshooting.
Resolve the issue.
The days of hunting through log files and debugging traffic to determine what caused a network outage are over. The amount of data that runs through these networks, and that has to be sorted through to chase down an issue, is increasing exponentially. As a result, manually sifting through information to get to the root cause of an issue is more difficult than ever before. Organizations rely on information relevant to what they are looking for; otherwise, the data is useless. For example, if a user couldn’t get on the wireless network last Tuesday at 3 p.m., and the logs are overwritten or filled with non-useful information, how does this help the network operations staff troubleshoot the issue at hand? It doesn’t. This wastes time, which is one of the most precious resources for network operations staff. The alternative is to use analytics and insights to direct network operators to the right place at the right time to take the right action. This is part of what Cisco DNA Assurance does as part of intent-based networking.
Problem isolation is much easier within an intent-based network because the entire network acts as a sensor that provides insights into the failures that are happening in the network. The network can also provide a holistic view of itself from a client perspective. From a wireless perspective alone, this can provide information such as failure reasons, received signal strength indicator (RSSI), and onboarding information.
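To make the client-perspective view concrete, the following minimal sketch assumes the network stores one record per client event containing the items named above: failure reason, RSSI, and onboarding information. The record layout, the function failures_near, and the sample data are all hypothetical and are not taken from Cisco DNA Assurance. The sketch shows how retained, structured telemetry lets an operator ask what happened to a given client around a given time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ClientEvent:
    """One hypothetical telemetry record for a wireless client."""
    client: str
    timestamp: datetime
    rssi_dbm: int                          # received signal strength indicator
    onboarding_step: str                   # e.g. "association", "authentication", "dhcp"
    failure_reason: Optional[str] = None   # None means the step succeeded


def failures_near(events, client, when, window=timedelta(minutes=15)):
    """Return the client's failed events within +/- window of 'when', oldest first."""
    hits = [
        e for e in events
        if e.client == client
        and e.failure_reason is not None
        and abs(e.timestamp - when) <= window
    ]
    return sorted(hits, key=lambda e: e.timestamp)


# Arbitrary example timestamp at 3 p.m.; all records below are made up.
incident = datetime(2024, 1, 9, 15, 0)
events = [
    ClientEvent("laptop-42", incident + timedelta(minutes=2), -80, "dhcp", "dhcp timeout"),
    ClientEvent("laptop-42", incident - timedelta(minutes=5), -82, "association", "weak signal"),
    ClientEvent("laptop-42", incident + timedelta(hours=3), -55, "dhcp", None),
    ClientEvent("phone-7", incident, -60, "authentication", "bad credentials"),
]
found = failures_near(events, "laptop-42", incident)
```

Here the query returns only the two failures for laptop-42 that fall inside the window, in time order, and leaves out the later successful event and the other client.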
One of the most time-draining parts of the troubleshooting process is trying to replicate the issue. The previously mentioned issue of a user not being able to get on the network last Tuesday at 3 p.m. would be very difficult to replicate. How would anyone know what was going on last Tuesday at 3 p.m.? In reality, the only traditional way to know what was going on from a wireless perspective was to have packet captures and spectrum analyzers running constantly. Because of cost, space, and not knowing where the issue may arise, this is not a practical approach. What if instead there was a solution that could not only act as a DVR for the network but also use streaming telemetry information such as NetFlow, SNMP, and syslog and correlate the issues to notify the network operations staff of what the issue was and when it happened, even if it happened in the past? Imagine the network providing all this information automatically. Additionally, instead of having Switched Port Analyzer (SPAN) ports configured across the campus with network sniffers plugged in everywhere in hopes of capturing the wireless traffic when there is an issue, imagine that the wireless access points could detect the anomaly and automatically run a packet capture locally on the AP that would capture the issue. All these analytics could provide guided remediation steps on how to fix the issue without requiring anyone to chase down all the clues to solve the mystery. Fortunately, that solution exists: Cisco DNA Assurance. It can also integrate through open APIs with many helpdesk ticketing platforms, such as ServiceNow. The advantage of this is that when an issue happens in the network, Cisco DNA Assurance can automatically detect it and create a helpdesk ticket that includes the details of the issue, a link to the issue in Assurance, and the guided remediation steps. That means when the on-call support engineer gets the call at 2 a.m., she already has the information on how to fix the issue.
Soon, automatic remediation will be available, so the on-call person won’t have to wake up at 2 a.m. when the ticket comes in. This is the power of Assurance and intent-based networks.
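As a rough illustration of the ticketing integration described above, the sketch below assembles a ticket body from a detected issue. It is not the Cisco DNA Assurance or ServiceNow API: the DetectedIssue class, the build_ticket function, the field names short_description and description, and the example URL are all assumptions made for this example. It shows only the idea that the ticket carries the issue details, a link back to the issue, and the guided remediation steps.

```python
from dataclasses import dataclass, field


@dataclass
class DetectedIssue:
    """Hypothetical summary of an issue raised by an assurance system."""
    title: str
    details: str
    issue_url: str
    remediation_steps: list = field(default_factory=list)


def build_ticket(issue: DetectedIssue) -> dict:
    """Assemble a ticket body: issue details, a link back, and numbered remediation steps."""
    steps = "\n".join(
        f"{n}. {step}" for n, step in enumerate(issue.remediation_steps, start=1)
    )
    description = f"{issue.details}\n\nIssue link: {issue.issue_url}"
    if steps:
        description += f"\n\nGuided remediation:\n{steps}"
    # Field names are illustrative; a real ticketing platform defines its own schema.
    return {"short_description": issue.title, "description": description}


# Made-up issue for illustration; example.com is a placeholder host.
issue = DetectedIssue(
    title="Wireless clients failing DHCP on floor 3",
    details="Multiple clients timed out waiting for an address.",
    issue_url="https://assurance.example.com/issues/123",
    remediation_steps=["Check the DHCP scope for exhaustion", "Verify the helper address"],
)
ticket = build_ticket(issue)
```

In a real deployment, a payload like this would be sent to the ticketing platform through its documented API, so the on-call engineer opens a ticket that already contains the remediation steps.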