Queuing
Sometimes referred to as congestion management, queuing mechanisms identify how traffic from multiple streams is sent out of an interface that is currently experiencing congestion. This section examines various approaches to queuing and emphasizes the queuing approaches configured via MQC.
Queuing Basics
When a device, such as a switch or a router, is receiving traffic faster than it can be transmitted, the device attempts to buffer the extra traffic until bandwidth is available. This buffering process is called queuing. You can use queuing mechanisms to influence in what order various traffic types are emptied from the queue.
Congestion occurs not just in the WAN but also in the LAN. Mismatched interface speeds, for example, could result in congestion on a high-speed LAN. Aggregation points, where multiple connections converge on a single link, can also become congested. For example, perhaps multiple workstations connect to a switch at Fast Ethernet speeds (that is, 100 Mbps), and the workstations are simultaneously transmitting to a server that is also connected through Fast Ethernet. Such a scenario can result in traffic backing up in a queue.
Although Cisco supports multiple queuing mechanisms, these Quick Reference Sheets primarily focus on CB-WFQ and LLQ. However, legacy queuing mechanisms are addressed first and include the following types:
FIFO queuing: First-in first-out (FIFO) queuing is not truly performing QoS operations. As its name suggests, the first packet to come into the queue is the first packet sent out of the queue.
Priority Queuing (PQ): This type of queuing places traffic into one of four queues. Each queue has a different level of priority, and higher-priority queues must be emptied before packets are emptied from lower-priority queues. This behavior can “starve out” lower-priority traffic.
Round robin queuing: This type of queuing places traffic into multiple queues, and packets are removed from these queues in a round-robin fashion, which avoids the protocol-starvation issue that PQ suffers from.
Weighted Round Robin (WRR) queuing: This type of queuing can place a weight on the various queues, to service a different number of bytes or packets from the queues during a round-robin cycle. Custom Queuing (CQ) is an example of a WRR queuing approach.
Deficit Round Robin (DRR) queuing: WRR approaches such as CQ can suffer from a “deficit” issue, which DRR corrects. For example, if you configured CQ to remove 1500 bytes from a queue during each round-robin cycle, and you had a 1499-byte packet and a 1500-byte packet in the queue, both packets would be sent. After the 1499-byte packet was transmitted, 1 byte of the allowance still had to be serviced, so CQ would start servicing the 1500-byte packet, and because CQ cannot send a partial packet, the entire packet would be sent. DRR keeps track of the number of extra bytes that are sent during a round and subtracts that number from the number of bytes that can be sent during the next round.
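The deficit behavior described in the last item can be sketched in Python. This is a minimal illustration, not Cisco's implementation: the function names cq_round and drr_round and the queue of 1000-byte packets are assumptions chosen for the example, and the deficit rule simply follows the description above (extra bytes sent in one round are subtracted from the next round's allowance).

```python
from collections import deque

def cq_round(queue, byte_count):
    """One CQ-style round: send whole packets until byte_count is reached or passed."""
    sent = []
    while queue and sum(sent) < byte_count:
        sent.append(queue.popleft())
    return sent

def drr_round(queue, quantum, deficit):
    """One DRR-style round: last round's overshoot shrinks this round's allowance."""
    allowance = quantum - deficit
    sent = []
    while queue and sum(sent) < allowance:
        sent.append(queue.popleft())
    return sent, max(0, sum(sent) - allowance)

# The example from the text: a 1499-byte and a 1500-byte packet, 1500-byte count.
first_round = cq_round(deque([1499, 1500]), 1500)

# Long-run comparison (hypothetical traffic): 1000-byte packets over four rounds.
cq_queue = deque([1000] * 20)
cq_total = sum(sum(cq_round(cq_queue, 1500)) for _ in range(4))

drr_queue, deficit, drr_total = deque([1000] * 20), 0, 0
for _ in range(4):
    packets, deficit = drr_round(drr_queue, 1500, deficit)
    drr_total += sum(packets)

print(first_round, cq_total, drr_total)  # [1499, 1500] 8000 6000
```

With a 1500-byte allowance over four rounds, the CQ-style queue sends 8000 bytes, whereas the DRR-style queue averages out to the intended 6000 bytes.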
A router has two types of queues: a hardware queue and a software queue. The hardware queue, which is sometimes referred to as the transmit queue (TxQ), always uses FIFO queuing, and only when the hardware queue is full does the software queue handle packets. Therefore, your queuing configuration only takes effect during periods of interface congestion, when the hardware queue has overflowed. With this basic understanding of queuing, you can now examine several queuing methods in more detail.
FIFO
Using FIFO in the software queue works just like FIFO in the hardware queue, where you are not truly performing packet manipulation. FIFO is the default queuing method on interfaces that run at speeds of greater than 2.048 Mbps.
Although FIFO is supported on all IOS platforms, it can starve out traffic by allowing bandwidth-hungry flows to take an unfair share of the bandwidth.
WFQ
Weighted Fair Queuing (WFQ) is enabled by default on slow-speed (that is, 2.048-Mbps and slower) interfaces. WFQ allocates a queue for each flow, for as many as 256 flows by default. WFQ uses IP Precedence values to provide a weighting to Fair Queuing (FQ). When emptying the queues, FQ does byte-by-byte scheduling. Specifically, FQ looks 1 byte deep into each queue to determine whether an entire packet can be sent. FQ then looks another byte deep into the queue to determine whether an entire packet can be sent. As a result, smaller traffic flows and smaller packet sizes have priority over bandwidth-hungry flows with large packets.
In the following example, three flows simultaneously arrive at a queue. Flow A has three packets, which are 128 bytes each. Flow B has a single 96-byte packet. Flow C has a single 70-byte packet. After 70 byte-by-byte rounds, FQ can transmit the packet from flow C. After an additional 26 rounds, FQ can transmit the packet from flow B. After an additional 32 rounds, FQ can transmit the first packet from flow A. Another 128 rounds are required to send the second packet from flow A. Finally, after a grand total of 384 rounds, the third packet from flow A is transmitted.
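The round counts in this example can be reproduced with a short Python sketch. It is only an illustration of the byte-by-byte idea as described here: the function name fq_finish_rounds and the dictionary layout are assumptions, and each packet is treated as finishing after as many rounds as the bytes queued ahead of it in its own flow plus its own size.

```python
def fq_finish_rounds(flows):
    """Return (packet, finishing round) pairs in transmission order.

    flows maps a flow name to the list of packet sizes (bytes) in its queue.
    """
    finished = []
    for name, sizes in flows.items():
        rounds = 0
        for index, size in enumerate(sizes, start=1):
            rounds += size  # one byte of each queue is examined per round
            finished.append((f"{name}{index}", rounds))
    return sorted(finished, key=lambda item: item[1])

flows = {"A": [128, 128, 128], "B": [96], "C": [70]}
fq_order = fq_finish_rounds(flows)
print(fq_order)
# [('C1', 70), ('B1', 96), ('A1', 128), ('A2', 256), ('A3', 384)]
```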
With WFQ, a packet’s IP Precedence influences the order in which that packet is emptied from a queue. Consider the previous scenario with the addition of IP Precedence markings. In this scenario, flow A’s packets are marked with an IP Precedence of 5, whereas flow B and flow C have default IP Precedence markings of 0. The order of packet servicing with WFQ is based on sequence numbers, where packets with the lowest sequence numbers are transmitted first.
The sequence number is the weight of the packet multiplied by the number of byte-by-byte rounds that must be completed to service the packet (just as in the FQ example). Cisco IOS calculates a packet’s weight differently depending on the IOS version. Prior to IOS 12.0(5)T, the formula for weight was as follows:
Weight = 4096 /(IP Prec. + 1)
In more recent versions of the IOS, the formula for weight is as follows:
Weight = 32384 /(IP Prec. + 1)
Using the pre-IOS 12.0(5)T formula, the sequence numbers are as follows:
A1 = 4096 /(5 + 1) * 128 = 87381
A2 = 4096 /(5 + 1) * 128 + 87381 = 174762
A3 = 4096 /(5 + 1) * 128 + 174762 = 262144
B1 = 4096 /(0 + 1) * 96 = 393216
C1 = 4096 /(0 + 1) * 70 = 286720
Therefore, after the weighting is applied, WFQ empties packets from the queue in the following order: A1, A2, A3, C1, B1. With only FQ, packets were emptied from the queue in the order C1, B1, A1, A2, A3.
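The sequence numbers above can be checked with a small Python sketch. The function name wfq_sequence_numbers and its weight_numerator parameter are assumptions made for illustration; the calculation follows the formula in the text, with each packet's sequence number building on the previous packet in the same flow. Exact fractions are used so that the truncated values match the figures listed above.

```python
from fractions import Fraction

def wfq_sequence_numbers(flows, weight_numerator=4096):
    """flows maps a flow name to (ip_precedence, [packet sizes in bytes])."""
    numbers = {}
    for name, (precedence, sizes) in flows.items():
        weight = Fraction(weight_numerator, precedence + 1)
        previous = Fraction(0)
        for index, size in enumerate(sizes, start=1):
            previous += weight * size  # builds on the flow's prior packet
            numbers[f"{name}{index}"] = previous
    return numbers

flows = {"A": (5, [128, 128, 128]), "B": (0, [96]), "C": (0, [70])}
seq = wfq_sequence_numbers(flows)
wfq_order = sorted(seq, key=seq.get)  # lowest sequence number is sent first
print({name: int(value) for name, value in seq.items()})
print(wfq_order)  # ['A1', 'A2', 'A3', 'C1', 'B1']
```

Passing weight_numerator=32384 applies the newer weight formula instead; for these three flows the resulting order is the same.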
Although WFQ has default settings, you can manipulate those settings with the following interface-configuration-mode command:
Router(config-if)#fair-queue [cdt [dynamic-queues [reservable-queues]]]
The cdt parameter identifies the Congestive Discard Threshold (CDT), which is the number of packets allowed in all WFQ queues before the router begins to drop packets that are attempting to enter the deepest queue (that is, the queue that currently has the most packets). The default CDT value is 64.
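As a rough model of the CDT rule just described, the following Python sketch drops a packet only when the total number of queued packets has reached the threshold and the packet belongs to the deepest queue. The function name admit, the flow names, and the queue depths are hypothetical, and the actual IOS drop logic has more detail than this simplified check.

```python
def admit(queues, flow, cdt=64):
    """Return True and enqueue if the packet is accepted, False if dropped.

    queues maps a flow name to its current packet count.
    """
    total = sum(queues.values())
    deepest = max(queues, key=queues.get) if queues else None
    if total >= cdt and flow == deepest:
        return False  # threshold reached and this flow owns the deepest queue
    queues[flow] = queues.get(flow, 0) + 1
    return True

queues = {"bulk": 60, "voice": 4}
bulk_ok = admit(queues, "bulk")
voice_ok = admit(queues, "voice")
print(bulk_ok, voice_ok)  # False True
```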
With WFQ, each flow is placed in its own queue, up to a maximum number of queues as defined by the dynamic-queues parameter. The default number of queues that is created dynamically (that is, dynamic-queues) is 256.
The reservable-queues parameter defines the number of queues that are made available to interface features such as RSVP. The default number of reservable queues is 0.
Although WFQ is easy to configure (for example, it is enabled by default on interfaces that run at or below 2.048 Mbps), and although WFQ is supported on all IOS versions, it has its limitations. Specifically, WFQ cannot guarantee a specific amount of bandwidth for an application. Also, if more than 256 flows exist, by default, more than one flow can be forced to share the same queue.
You can view statistics for WFQ with the show interface interface-identifier command. The output from this command not only verifies that WFQ is enabled on the specified interface, but it also shows such information as the current queue depth and the maximum number of queues allowed.
CB-WFQ
The WFQ mechanism made sure that no traffic was starved out. However, WFQ did not make a specific amount of bandwidth available for defined traffic types. You can, however, specify a minimum amount of bandwidth to make available for various traffic types using the CB-WFQ mechanism.
CB-WFQ is configured through the three-step MQC process. Using MQC, you can create up to 63 class-maps and assign a minimum amount of bandwidth for each one. Note that the reason you cannot create 64 class-maps is that the class-default class-map has already been created.
Traffic for each class-map goes into a separate queue. Therefore, one queue (for example, for CITRIX traffic) can be overflowing, while other queues are still accepting packets. Bandwidth allocations for various class-maps can be specified in one of three ways: bandwidth, percentage of bandwidth, and percentage of remaining bandwidth. The following paragraphs describe each of these allocations.
You can make a specific amount of bandwidth available for classified traffic. To allocate a bandwidth amount, use the following command, noting that the units of measure are in kbps:
Router(config-pmap-c)#bandwidth bandwidth
Instead of specifying an exact amount of bandwidth, you can specify a percentage of the interface bandwidth. For example, a policy-map could allocate 25 percent of an interface’s bandwidth. Then, that policy-map could be applied to, for example, a Fast Ethernet interface and also to a slower-speed serial interface. To allocate a percentage of the interface bandwidth, use the following command:
Router(config-pmap-c)#bandwidth percent percent
As an alternative to allocating a percentage of the total interface bandwidth, you can also allocate a percentage of the remaining bandwidth (that is, after other bandwidth allocations have already been made). To allocate a percentage of the remaining interface bandwidth, use the following command:
Router(config-pmap-c)#bandwidth remaining percent percent
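The three allocation forms can be compared with a small Python calculation. This is a sketch under stated assumptions: the function name class_bandwidth_kbps, the class names, and the 1544-kbps serial speed are hypothetical, the three forms are mixed in one policy purely for illustration (a given platform may restrict such combinations), and "remaining" is computed here as whatever is left after the explicit kbps and percent allocations.

```python
def class_bandwidth_kbps(interface_kbps, classes):
    """classes maps a class name to ("kbps" | "percent" | "remaining", value)."""
    result = {}
    for name, (kind, value) in classes.items():
        if kind == "kbps":
            result[name] = value
        elif kind == "percent":
            result[name] = interface_kbps * value / 100
    remaining = interface_kbps - sum(result.values())
    for name, (kind, value) in classes.items():
        if kind == "remaining":
            result[name] = remaining * value / 100
    return result

classes = {"CITRIX": ("percent", 25), "WEB": ("kbps", 128), "BULK": ("remaining", 50)}
serial = class_bandwidth_kbps(1544, classes)       # assumed T1-speed serial link
fast_eth = class_bandwidth_kbps(100_000, classes)  # Fast Ethernet
print(serial)    # {'CITRIX': 386.0, 'WEB': 128, 'BULK': 515.0}
print(fast_eth)  # {'CITRIX': 25000.0, 'WEB': 128, 'BULK': 37436.0}
```

The same 25 percent allocation yields 386 kbps on the serial link and 25,000 kbps on Fast Ethernet, which is why the percentage form lets one policy-map be reused across interfaces of different speeds.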
By default, each queue that is used by CB-WFQ has a capacity of 64 packets. However, this limit is configurable with the following command:
Router(config-pmap-c)#queue-limit number_of_packets
Although CB-WFQ queues typically use FIFO for traffic within a particular queue, the class-default queue can be enabled for WFQ with the following command:
Router(config-pmap-c)#fair-queue [dynamic-queues]
As noted earlier, CB-WFQ is configured through MQC. Therefore, the standard MQC verification and troubleshooting commands, such as show policy-map interface interface-identifier, are applicable for CB-WFQ.
By default, only 75 percent of an interface’s bandwidth can be allocated. The remaining 25 percent is reserved for nonclassified or overhead traffic (for example, CDP, LMI, or routing protocols). This limitation can be overcome with the max-reserved-bandwidth percentage interface-configuration-mode command, where the percentage option is the percentage of an interface’s bandwidth that can be allocated.
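The effect of the 75 percent default can be checked with a few lines of Python. The function name within_reservable, the 1544-kbps link speed, and the allocation figures are assumptions for illustration; the sketch only compares the summed class allocations with the reservable share of the interface.

```python
def within_reservable(interface_kbps, allocations_kbps, max_reserved_percent=75):
    """Return (ok, reservable_kbps) for the summed class allocations."""
    reservable = interface_kbps * max_reserved_percent / 100
    return sum(allocations_kbps) <= reservable, reservable

small = within_reservable(1544, [128, 256])       # 384 kbps requested
large = within_reservable(1544, [768, 512])       # 1280 kbps requested
raised = within_reservable(1544, [768, 512], 90)  # after raising the limit to 90
print(small[0], large[0], raised[0])  # True False True
```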
CB-WFQ is therefore an attractive queuing mechanism, thanks to its MQC configuration style and its ability to assign a minimum bandwidth allocation. The only major drawback to CB-WFQ is its inability to give priority treatment to any traffic class. Fortunately, an enhancement to CB-WFQ, called Low Latency Queuing (LLQ), does support traffic prioritization.
LLQ
Low Latency Queuing (LLQ) is almost identical to CB-WFQ. However, with LLQ, you can instruct one or more class-maps to direct traffic into a priority queue. Realize that when you place packets in a priority queue, you are not only allocating a bandwidth amount for that traffic, but you also are policing (that is, limiting the available bandwidth for) that traffic. The policing option is necessary to prevent higher-priority traffic from starving out lower-priority traffic.
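The combined guarantee-and-limit behavior of the priority queue can be pictured with a short Python calculation. It models only a congested interface, and the function name llq_under_congestion, the 1544-kbps link speed, and the offered loads are hypothetical values chosen for illustration.

```python
def llq_under_congestion(link_kbps, priority_kbps, offered_priority_kbps):
    """Return (priority sent, priority dropped, left for other classes) in kbps."""
    sent = min(offered_priority_kbps, priority_kbps)  # policed to the configured rate
    dropped = offered_priority_kbps - sent
    return sent, dropped, link_kbps - sent

light = llq_under_congestion(1544, 256, 200)
heavy = llq_under_congestion(1544, 256, 400)
print(light)  # (200, 0, 1344)
print(heavy)  # (256, 144, 1288)
```

In the second case the priority class is held to its 256 kbps, so the other classes still have 1288 kbps available instead of being starved.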
Note that if you tell multiple class-maps to give priority treatment to their packets, all priority packets go into the same queue. Therefore, priority traffic could suffer from having too many priority classes. Packets that are queued in the priority queue cannot be fragmented, which is a consideration for slower-speed links (that is, link speeds of less than 768 kbps). LLQ, based on all the listed benefits, is the Cisco-preferred queuing method for latency-sensitive traffic, such as voice and video.
You can use either of the following commands to direct packets to the priority queue:
Router(config-pmap-c)#priority bandwidth
(Note that the bandwidth units of measure are in kbps.)
Router(config-pmap-c)#priority percent percent
(Note that the percent option references a percentage of the interface bandwidth.)
Consider the following LLQ example.
Router(config)#class-map SURFING
Router(config-cmap)#match protocol http
Router(config-cmap)#exit
Router(config)#class-map VOICE
Router(config-cmap)#match protocol rtp
Router(config-cmap)#exit
Router(config)#policy-map QOS_STUDY
Router(config-pmap)#class SURFING
Router(config-pmap-c)#bandwidth 128
Router(config-pmap-c)#exit
Router(config-pmap)#class VOICE
Router(config-pmap-c)#priority 256
Router(config-pmap-c)#exit
Router(config-pmap)#exit
Router(config)#interface serial 0/1
Router(config-if)#service-policy output QOS_STUDY
In this example, NBAR is being used to recognize http traffic, and that traffic is placed in the SURFING class. Note that NBAR is invoked with the following command:
Router(config-cmap)#match protocol
Voice packets are placed in the VOICE class. The QOS_STUDY policy-map gives 128 kbps of bandwidth to the http traffic while giving 256 kbps of priority bandwidth to voice traffic. Then the policy-map is applied outbound to interface serial 0/1.
Catalyst-Based Queuing
Some Cisco Catalyst switches also support their own queuing method, called Weighted Round Robin (WRR) queuing. For example, a Catalyst 2950 switch has four queues, and WRR can be configured to place frames with specific CoS markings into certain queues. (For example, CoS values 0 and 1 are placed in Queue 1.)
Weights can be assigned to the queues, influencing how much bandwidth the various markings receive. The queues are then serviced in a round-robin fashion. On the Catalyst 2950, queue number 4 can be designated as an “expedite” queue, which gives priority treatment to frames in that queue. Specifically, the expedite queue must be empty before any additional queues are serviced. This behavior can lead to protocol starvation.
On a Catalyst 2950, frames are queued based on their CoS values. The following command can be used to alter the default queue assignments:
Switch(config)#wrr-queue cos-map queue_number cos_value_1 cos_value_2 ... cos_value_n
For example, the following command would map CoS values of 0, 1, and 2 to queue number 1 on a Catalyst 2950:
Switch(config)#wrr-queue cos-map 1 0 1 2
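The effect of this command on the CoS-to-queue mapping can be pictured as a simple table update, sketched here in Python. The function name apply_cos_map is hypothetical, and the starting map (two CoS values per queue) is an assumption for illustration; verify the actual defaults on your switch.

```python
def apply_cos_map(cos_to_queue, queue_number, *cos_values):
    """Return a new CoS-to-queue map with the given CoS values moved to queue_number."""
    updated = dict(cos_to_queue)
    for cos in cos_values:
        updated[cos] = queue_number
    return updated

# Assumed starting map (two CoS values per queue); check your switch's actual defaults.
default_map = {0: 1, 1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 4}
new_map = apply_cos_map(default_map, 1, 0, 1, 2)  # wrr-queue cos-map 1 0 1 2
print(new_map)
# {0: 1, 1: 1, 2: 1, 3: 2, 4: 3, 5: 3, 6: 4, 7: 4}
```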
The weight that is assigned to a queue specifies how many packets are emptied from a queue during each round-robin cycle, relative to other queues. You can configure queue weights with the following command:
Switch(config)#wrr-queue bandwidth weight_1 weight_2 weight_3 weight_4
Remember that queue number 4 on a Catalyst 2950 can be configured as an expedite queue (that is, a priority queue). To configure queue number 4 as an expedite queue, set its weight to 0.
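A single service cycle with an expedite queue can be sketched in Python as follows. This is a simplified model, not the switch's scheduler: the function name wrr_cycle and the queue contents are hypothetical, weights are treated as packets per cycle, and the expedite queue is checked only at the start of the cycle rather than continuously.

```python
from collections import deque

def wrr_cycle(queues, weights):
    """Serve one round-robin cycle and return the queue numbers served, in order.

    queues: list of deques (index 0 is queue 1). A weight of 0 marks the
    expedite queue, which is drained before any other queue is serviced.
    """
    served = []
    for number, (queue, weight) in enumerate(zip(queues, weights), start=1):
        if weight == 0:
            while queue:
                queue.popleft()
                served.append(number)
    for number, (queue, weight) in enumerate(zip(queues, weights), start=1):
        for _ in range(weight):
            if not queue:
                break
            queue.popleft()
            served.append(number)
    return served

queues = [deque("aaaa"), deque("bbbb"), deque("cccc"), deque("dd")]
cycle = wrr_cycle(queues, [1, 2, 3, 0])
print(cycle)  # [4, 4, 1, 2, 2, 3, 3, 3]
```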
Following is an example of a WRR configuration.
Switch(config)#interface gig 0/5
Switch(config-if)#wrr-queue bandwidth 1 2 3 4
Switch(config-if)#wrr-queue cos-map 4 5
In this example, the wrr-queue bandwidth command assigns the weights 1, 2, 3, and 4 to the switch’s four queues. The first queue, with a weight of 1, for example, gets only one-third the bandwidth that is given to the third queue, which has a weight of 3. The wrr-queue cos-map 4 5 command instructs frames that are marked with a CoS of 5 to enter the fourth queue.
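The relative shares implied by the weights 1, 2, 3, and 4 can be worked out with a short Python calculation. The function name wrr_shares is an assumption for illustration, and the sketch assumes that all four queues always have traffic waiting.

```python
from fractions import Fraction

def wrr_shares(weights):
    """Return each queue's share of the bandwidth as a Fraction."""
    total = sum(weights)
    return [Fraction(weight, total) for weight in weights]

shares = wrr_shares([1, 2, 3, 4])
print([f"{float(share):.0%}" for share in shares])  # ['10%', '20%', '30%', '40%']
print(shares[0] / shares[2])                        # 1/3
```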
To verify how a Catalyst 2950 is mapping CoS values to DSCP values (or vice versa), use the following command:
Switch#show mls qos maps [cos-dscp | dscp-cos]
You can use the following command to view the weight that is assigned to each queue:
Switch#show wrr-queue bandwidth
Another useful WRR command, which shows how CoS values are being mapped to switch queues, is as follows:
Switch#show wrr-queue cos-map
Finally, you can see the QoS configuration for an interface (for example, trust state and the interface’s default CoS value) with the following command:
Switch#show mls qos interface [interface-identifier]