WAN QoS
QoS configuration and application differ between high-speed interfaces on Catalyst switches and low-speed WAN interfaces on Cisco IOS routers or on WAN modules of Catalyst switches. This section highlights a few QoS configurations and features that are applicable to low-speed serial interfaces. Specifically, it introduces weighted fair queuing (WFQ) and low-latency queuing (LLQ).
Weighted Fair Queuing
Flow-based and class-based WFQ apply priority (or weights) to identified traffic, classifying traffic into conversations and determining how much bandwidth each conversation is allowed relative to other conversations. WFQ classifies traffic into different flows based on characteristics such as source and destination address, protocol, and the port and socket numbers of the session. WFQ is the default queuing mechanism for E1 and slower links.
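Because WFQ is on by default for these links, configuration is usually limited to enabling or tuning it. As a minimal sketch (the interface numbering is hypothetical, and the congestive discard threshold of 64 shown here is simply the default value, included for illustration), WFQ is applied with the fair-queue interface command:
Switch(config)#interface serial 0/0
Switch(config-if)#fair-queue 64
Switch(config-if)#end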
Class-based WFQ (CBWFQ) extends the standard WFQ functionality to provide support for user-defined traffic classes. This enables you to specify the exact amount of bandwidth to be allocated to a specific class of traffic. Taking into account the available bandwidth on the interface, you can configure up to 64 classes and control the distribution of bandwidth among them.
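As an illustrative sketch (the class and policy names, the DSCP AF31 match, the 128-kbps guarantee, and the interface numbering are assumed values, not recommendations), a CBWFQ class is defined with the class-map and policy-map commands and attached to the outgoing interface with service-policy:
Switch(config)#class-map match-all BUSINESS-DATA
Switch(config-cmap)#match ip dscp af31
Switch(config-cmap)#exit
Switch(config)#policy-map WAN-EDGE
Switch(config-pmap)#class BUSINESS-DATA
Switch(config-pmap-c)#bandwidth 128
Switch(config-pmap-c)#exit
Switch(config-pmap)#class class-default
Switch(config-pmap-c)#fair-queue
Switch(config-pmap-c)#exit
Switch(config-pmap)#exit
Switch(config)#interface serial 0/0
Switch(config-if)#service-policy output WAN-EDGE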
Low-Latency Queuing
The LLQ feature provides the ability to specify low-latency behavior for a traffic class. LLQ allows delay-sensitive data to be dequeued and sent first, before packets in other queues are dequeued, giving delay-sensitive data preferential treatment over other traffic.
The priority command is used to give a class this treatment. LLQ uses a single priority queue within which individual classes of traffic are placed.
LLQ offers these features:
LLQ supports multiple traffic types over various Layer 2 technologies, including High-Level Data Link Control (HDLC), Point-to-Point Protocol (PPP), ATM, and Frame Relay.
Traffic in the priority class is policed to its configured bandwidth to ensure that other traffic classes are still serviced.
The rate limit is enforced per class, even if multiple classes direct traffic to the priority queue.
Oversubscription of bandwidth is not allowed for the priority class.
No WRED support is provided on priority classes. WRED is allowed only on bandwidth classes.
Bandwidth and priority are mutually exclusive.
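A minimal LLQ sketch follows; the VOICE class name, the DSCP EF match, the 64-kbps priority allocation, and the interface numbering are illustrative assumptions, and the BUSINESS-DATA class reuses the hypothetical class from the earlier CBWFQ sketch. Because bandwidth and priority are mutually exclusive, the voice class uses priority while the data class uses bandwidth:
Switch(config)#class-map match-all VOICE
Switch(config-cmap)#match ip dscp ef
Switch(config-cmap)#exit
Switch(config)#policy-map LLQ-POLICY
Switch(config-pmap)#class VOICE
Switch(config-pmap-c)#priority 64
Switch(config-pmap-c)#exit
Switch(config-pmap)#class BUSINESS-DATA
Switch(config-pmap-c)#bandwidth 128
Switch(config-pmap-c)#exit
Switch(config-pmap)#exit
Switch(config)#interface serial 0/0
Switch(config-if)#service-policy output LLQ-POLICY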
The next two sections discuss two additional WAN QoS features, IP RTP Priority and link-efficiency mechanisms.
IP RTP Priority
The IP Real-Time Transport Protocol Priority (IP RTP Priority) feature provides a strict-priority queuing scheme that allows delay-sensitive data such as voice to be dequeued and sent before packets in other queues. This feature is similar to strict-priority queuing on Catalyst Ethernet interfaces but is applicable to low-speed serial or WAN interfaces on Cisco IOS routers. IP RTP Priority is especially useful on links whose speed is less than 1.544 Mbps.
Use this feature on serial interfaces or other similar WAN interfaces in conjunction with either WFQ or CBWFQ on the same outgoing interface. In either case, traffic matching the range of User Datagram Protocol (UDP) ports specified for the priority queue is guaranteed strict priority over other CBWFQ classes or WFQ flows; packets in the priority queue are always serviced first.
Voice traffic can be identified by its RTP port numbers and classified into a priority queue. The result of using this feature is that voice traffic is serviced as strict priority in preference to nonvoice traffic. Figure 10-16 illustrates the behavior of IP RTP Priority (PQ stands for priority queuing in the figure).
Figure 10-16 IP RTP Priority
When configuring the priority queue with the IP RTP Priority feature, you specify a strict bandwidth limitation. The switch or router guarantees this amount of bandwidth to traffic queued in the priority queue. IP RTP Priority polices the flow every second and prohibits transmission of additional packets once the allocated bandwidth is consumed; if it discovers that the configured amount of bandwidth has been exceeded, IP RTP Priority drops packets. By default, the sum of all bandwidth allocations for voice and data flows on an interface cannot exceed 75 percent of the total available bandwidth. Bandwidth allocation takes into account the payload plus the IP, UDP, and RTP headers, but not the Layer 2 header. Allowing 25 percent of the bandwidth for other traffic and Layer 2 overhead is conservative and safe.
There are two basic commands for configuring IP RTP Priority:
ip rtp priority starting-rtp-port-number port-number-range bandwidth
max-reserved-bandwidth percent
The ip rtp priority command specifies a starting RTP destination port number (starting-rtp-port-number), a port range (port-number-range), and the maximum reserved bandwidth in kbps (bandwidth). The percent option of the max-reserved-bandwidth command specifies the percentage of interface bandwidth that can be allocated to LLQ and IP RTP Priority. Example 10-12 illustrates a sample configuration of IP RTP Priority with a starting RTP port number of 16,384, a range of 16,383 UDP ports, a maximum bandwidth of 25 kbps, and the maximum bandwidth allocated to LLQ and IP RTP Priority raised from the default (75 percent) to 80 percent.
Example 10-12 Sample IP RTP Configuration
Switch(config)#multilink virtual-template 1
Switch(config)#interface virtual-template 1
Switch(config-if)#ip address 172.16.1.1 255.255.255.0
Switch(config-if)#no ip directed-broadcast
Switch(config-if)#ip rtp priority 16384 16383 25
Switch(config-if)#service-policy output policy1
Switch(config-if)#ppp multilink
Switch(config-if)#ppp multilink fragment-delay 20
Switch(config-if)#ppp multilink interleave
Switch(config-if)#max-reserved-bandwidth 80
Switch(config-if)#end
Link-Efficiency Mechanisms
Cisco IOS software offers the following three link-efficiency mechanisms that work in conjunction with queuing and traffic shaping to improve efficiency and predictability of the application service levels:
Payload compression
Header compression
Link fragmentation and interleaving (LFI)
These features are applicable to low-speed WAN interfaces and are emerging for use on high-speed Ethernet interfaces.
Payload Compression
Although the QoS portfolio offers many mechanisms for optimizing throughput and reducing delay in network traffic, QoS does not create bandwidth; it optimizes the use of existing resources and enables traffic to be differentiated according to operator policy. Payload compression, in contrast, effectively creates additional bandwidth: by squeezing packet payloads, it increases the amount of data that can be sent through a transmission resource in a given time period. Payload compression is mostly performed on Layer 2 frames and, as a result, compresses the entire Layer 3 packet.
Note that IP Payload Compression Protocol (PCP) is a fairly new technique for compressing payloads at Layer 3, and it can handle out-of-order data. Because compression squeezes payloads, it both increases the perceived throughput and decreases the perceived latency of transmission: smaller packets with compressed payloads take less time to transmit than larger, uncompressed packets.
Compression is a CPU-intensive task that may add per-packet delay due to the application of the compression method to each frame. The transmission (serialization) delay, however, is reduced, because the resulting frame is smaller. Depending on the complexity of the payload compression algorithm, overall latency might be reduced, especially on low-speed links. Cisco IOS supports three different compression algorithms used in Layer 2 compression:
STAC (or Stacker)
Microsoft Point-to-Point Compression (MPPC)
Predictor
These algorithms differ slightly in their compression efficiency, and in utilization of router resources. Catalyst switches support compression with specialized WAN modules or security modules.
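As a sketch, payload compression is enabled per link with the compress interface command; both ends of the link must use the same algorithm, and the interface numbering shown is hypothetical:
Switch(config)#interface serial 0/0
Switch(config-if)#encapsulation ppp
Switch(config-if)#compress stac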
Header Compression
All compression methods are based on eliminating redundancy when sending the same or similar data over a transmission medium. One frequently repeated piece of data is the protocol header: within a flow, the header information of successive packets changes very little over the lifetime of that flow. Therefore, most header information can be sent only at the beginning of the session, stored in a dictionary, and later referenced by a short dictionary index.
The IETF standardized on two compression methods for use with IP protocols:
TCP header compression (also known as Van Jacobson or VJ header compression): Used to compress the TCP headers of packets over slow links, considerably improving interactive application performance.
RTP header compression: Used to compress UDP and RTP headers, lowering the delay for transporting real-time data, such as voice and video, over slower links.
It is important to note that network devices perform header compression on a link-by-link basis. Network devices do not support header compression across multiple routers, because routers need full Layer 3 header information to be able to route packets to the next hop.
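For illustration, each method is enabled per interface with a single command, configured on both ends of the link (the interface numbering is an assumption):
Switch(config)#interface serial 0/0
Switch(config-if)#ip tcp header-compression
Switch(config-if)#ip rtp header-compression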
Link Fragmentation and Interleaving
Link fragmentation and interleaving (LFI) is a Layer 2 technique in which large Layer 2 frames are broken into small, equal-size fragments and transmitted over the link in an interleaved fashion. Without LFI, a small delay-sensitive frame can be scheduled behind a large frame in the WFQ system and must wait for the entire large frame to be transmitted. When fragmentation and interleaving are in effect, the network device fragments large frames in the queuing system and interleaves the waiting small frames between the fragments as it sends them over the link. This reduces the queuing delay of small frames, because they are sent almost immediately. Link fragmentation therefore reduces delay and jitter by normalizing the size of larger packets, offering more regular transmission opportunities to voice packets.
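To see the effect, consider a hypothetical 64-kbps link: a 1500-byte frame takes 1500 bytes x 8 bits per byte / 64,000 bps = 187.5 ms to serialize, so a voice packet queued behind it waits far longer than a typical 10-ms blocking target. If the frame is fragmented into 80-byte fragments, a voice packet waits at most 80 x 8 / 64,000 = 10 ms for the fragment ahead of it to finish.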
The following LFI mechanisms are implemented in Cisco IOS:
Multilink PPP with interleaving is by far the most common and widely used form of LFI; a configuration sketch follows this list.
FRF.11 Annex C LFI is used with Voice over Frame Relay (VoFR).
FRF.12 Frame Relay LFI is used with Frame Relay data connections.
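The following minimal Multilink PPP interleaving sketch mirrors the virtual-template commands shown in Example 10-12; the multilink group number, IP address, fragment delay, and interface numbering are assumed values:
Switch(config)#interface multilink 1
Switch(config-if)#ip address 10.1.1.1 255.255.255.252
Switch(config-if)#ppp multilink
Switch(config-if)#ppp multilink fragment-delay 10
Switch(config-if)#ppp multilink interleave
Switch(config-if)#exit
Switch(config)#interface serial 0/0
Switch(config-if)#encapsulation ppp
Switch(config-if)#ppp multilink
Switch(config-if)#ppp multilink group 1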