Building the Policy Framework
To achieve a consistent end-to-end network QoS implementation, it is expected that all IP traffic follow a set of defined processes from the source to the destination device. As a strategy, the trust boundary should be established at the LAN edge closest to the connected devices, such as desktop/voice/server L2 switches or lab gateways. All IP traffic that arrives at the LAN edge switch should be classified as either trusted or untrusted.
Classification and Marking of Traffic
As the name implies, trusted devices, such as IP phones, Call Manager servers, and Unity voice servers, already originate traffic with the ToS field marked to the appropriate values. Hence, it is simply a matter of trusting and preserving the received ToS values when the packets are subsequently forwarded to the next switch or router in the network. On the other hand, not all traffic originating from devices that are user- or admin-configurable, such as desktop PCs and file/print/application servers, should be trusted with respect to its ToS settings (real-time desktop applications are covered in later sections). Therefore, the traffic from these devices needs to be re-marked with the appropriate ToS values that accurately reflect the desired level of priority for such traffic up to a predefined bandwidth limit.
As soon as the traffic is classified as trusted or is re-marked with the correct ToS settings at the LAN edge into one of the defined classes of service, a trusted edge boundary is established. This enables the traffic to be fully trusted within the network such that it can be prioritized and acted on accordingly at any of the potential congestion points. Typically, these congestion points are at the WAN edge; however, they can also be found at the LAN aggregations.
Trusted Edge
For traffic to be fully trusted within the network core, it is critical that all traffic classification and re-marking at the edge be performed before any frames are forwarded to the next switch or router. This means that the ingress traffic must be inspected to determine whether it is to be trusted or untrusted. If it is considered trusted, the L2 CoS, Layer 3 (L3) IPP, or DSCP values can be derived from the incoming frame/packet and subsequently forwarded to the next device unmodified. Untrusted traffic, on the other hand, should be rewritten to a default DSCP value.
Device Trust
To simplify the edge classification and re-marking operation, the concept of a trusted device needs to be defined. These are devices that are known to correctly provide QoS markings for traffic they originate. Furthermore, these devices also have limited or minimal user QoS configuration capability, such as IP phones, Call Manager/Unity/IPCC servers, and voice/videoconferencing gateways. Whenever these devices are connected to the L2 edge switch, it is easier to trust the L2 CoS or L3 DSCP information on the ingress port rather than to manually identify the type of traffic that should be trusted by means of access control lists (ACLs). For some traffic types, such as RTP streams, it is more difficult to match on specific Layer 4 (L4) ports because the application operates on dynamic port ranges. In such cases, where possible, it is preferable to allow the device or application to correctly classify the traffic and be trusted by the switch.
Application Trust
Although it is possible to establish a trust boundary using ingress CoS or ToS values from devices that are considered trusted, it is important to note that not all devices support the proper QoS marking. Hence, forwarding the traffic without first modifying the ToS value to the appropriate IPP or DSCP can potentially result in erroneous QoS treatment of the traffic for a particular CoS at the WAN edge. By passing all untrusted traffic through an ACL at the LAN edge, it is possible to correctly identify applications that cannot provide correct QoS marking based on the L3/L4 protocol information. Subsequently, these applications can be reclassified and marked to the appropriate classes of service. All other traffic that the ACL does not correctly identify should have its CoS and/or ToS values rewritten to the default/best-effort CoS.
An example of classification by ACL is to re-mark Cisco Softphone RTP streams originating from workstations to Class 5, the associated Skinny (SCCP) packets to Class 3, and non-drop-sensitive batch transfer traffic to Class 1. All other traffic is rewritten to Class 0 regardless of how the original ToS values are set.
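The ACL-style classification described above can be sketched in Python as a first-match rule list. This is a minimal illustration, not a switch configuration: the `Packet` type, the function name, and the match criteria are hypothetical. In particular, the RTP port range and the choice of FTP as the batch application are assumptions. Real deployments match on whatever ports their applications actually use.

```python
from dataclasses import dataclass

# Illustrative match criteria only (assumptions, not taken from the text).
RTP_UDP_PORTS = range(16384, 32768)  # assumed softphone RTP port range
SCCP_TCP_PORT = 2000                 # Skinny (SCCP) signaling port
BATCH_TCP_PORTS = {20, 21}           # assumed batch transfer application (FTP)


@dataclass(frozen=True)
class Packet:
    protocol: str        # "udp" or "tcp"
    dst_port: int
    original_class: int  # class implied by the ToS value the host set


def classify_untrusted(pkt: Packet) -> int:
    """Return the class of service for a packet arriving on an untrusted port.

    Rules are evaluated in order, like ACL entries; the first match wins.
    The host's own marking (original_class) is deliberately ignored.
    """
    if pkt.protocol == "udp" and pkt.dst_port in RTP_UDP_PORTS:
        return 5  # voice bearer
    if pkt.protocol == "tcp" and pkt.dst_port == SCCP_TCP_PORT:
        return 3  # call signaling
    if pkt.protocol == "tcp" and pkt.dst_port in BATCH_TCP_PORTS:
        return 1  # non-drop-sensitive batch transfer
    return 0      # everything else is rewritten to best effort
```

Note that a web packet that arrives already marked as Class 5 still ends up in Class 0, because the original marking plays no part in the decision.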
When trust is enabled for specific real-time desktop applications, such as Cisco Softphone and videoconferencing, ingress policing or rate limiting of traffic belonging to Classes 3, 4, and 5 should also be applied at the LAN switch. This ensures that each attached desktop machine does not exceed a predefined maximum bandwidth value for these priority classes of service. This does not eliminate the need for a comprehensive call admission control (CAC) implementation for voice and video. However, it is another level of protection against potential misuse of the QoS classes of service, whether intentional or unintentional.
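Per-port ingress policing of this kind is commonly modeled as a token bucket. The following Python sketch shows the idea only. The class name, parameters, and the single-rate design are assumptions for illustration, and the caller supplies timestamps so the behavior is deterministic.

```python
class TokenBucketPolicer:
    """Single-rate policer: packets conform while tokens remain in the bucket."""

    def __init__(self, rate_bps: int, burst_bytes: int) -> None:
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # bucket starts full
        self.last_time = 0.0

    def conforms(self, size_bytes: int, now: float) -> bool:
        """Return True if the packet is in profile at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last_time)
        self.last_time = max(self.last_time, now)
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False  # out of profile: drop or re-mark, per policy
```

For example, with `TokenBucketPolicer(rate_bps=128_000, burst_bytes=1000)`, a 600-byte packet at time 0 conforms, a second 600-byte packet at the same instant does not, and a third one a second later conforms again because the bucket has refilled.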
CoS and DSCP
At the L2 edge, Cisco IP phones can trunk to the Catalyst switches using 802.1Q VLAN tagging. Within the 802.1Q tagged frame is a 3-bit CoS field that is commonly referred to as the 802.1p bits. This field is the same width as the 3 IPP bits within the L3 ToS field, so the values map one to one. Hence, to maintain end-to-end QoS, it is necessary to ensure that the CoS-to-IPP and IPP-to-CoS mappings are consistent throughout the network. Similarly, CoS values can also be mapped to DSCP to provide the same end-to-end QoS functionality; however, care must be taken to ensure that each CoS value is mapped to a range of DSCP values that share the same three most significant bits (MSBs).
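The relationship between CoS, IPP, and DSCP comes down to bit positions: the three most significant bits of the 6-bit DSCP are the IPP bits. A short Python sketch (the function names are illustrative) makes the mapping explicit:

```python
def cos_from_dscp(dscp: int) -> int:
    """Derive the 3-bit CoS/IPP value from the three MSBs of a 6-bit DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp >> 3


def dscp_range_for_cos(cos: int) -> range:
    """Return the eight DSCP values that share `cos` as their three MSBs."""
    if not 0 <= cos <= 7:
        raise ValueError("CoS must be in 0..7")
    return range(cos << 3, (cos << 3) + 8)
```

DSCP 46 (EF) is binary 101110, so its three MSBs are 101, which is CoS 5. Any DSCP chosen for CoS 5 should therefore come from the range 40 through 47.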
By leveraging the CoS setting of the frame coming from the phone, strict priority ingress queuing is possible on Catalyst platforms that support a receive queue mechanism. Because the intelligence of the ingress application-specific integrated circuit (ASIC) on the switch is limited to L2 header inspection, ingress queuing based on L3 ToS values is not possible. For trusted devices that are not capable of 802.1Q trunking to the Ethernet switches, as well as ports configured for edge switch to router uplinks, it is necessary to trust the DSCP values of all incoming IP packets.
Strategy for Classifying Voice Bearer Traffic
Voice traffic traverses the network in the form of RTP streams. The Cisco IP phone originates RTP packets with a DSCP value of 46 (EF) and a CoS value of 5. Based on the device trust model, the DSCP value of voice packets should be preserved across the network. Because MPLS VPN is the WAN transport you are considering, it is common across many service providers to expect RTP traffic presented at the ingress of the provider edge (PE) device to be marked with a DSCP value of 46.
QoS on Backup WAN Connections
Today, WAN services are diverse. Depending on the geographic location of the sites, they may include technologies such as ATM, Frame Relay, ISDN, point-to-point time-division multiplexing (TDM), and network-based VPNs. However, a site is not always provisioned with equal-sized connections. Hence, if a backup connection exists, it is expected to be of a lower bandwidth than the primary link as well as being idle during normal operating conditions. This means that when the site's traffic is required to be routed over the backup connection, potential oversubscription of the link may occur.
To understand how QoS can be applied to backup WAN circuits, it is important to understand exactly how much bandwidth is allocated for each CoS on the primary connection. However, due to the diverse nature of the types of site locations and sizes of WAN circuits implemented in today's environment, the overall amount of bandwidth required for real-time traffic can vary from one site to another. It is therefore recommended that, for any given fixed-line primary WAN link, no more than 33 percent of the total available bandwidth be assigned to traffic belonging to Class 5. This is consistent with the overall recommendations for provisioning LLQ, as discussed in earlier sections. This allocation can also be overprovisioned.
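As a rough illustration of the 33 percent guideline, the Python sketch below computes the Class 5 ceiling for a link and the gap that opens up when traffic moves to a smaller backup link. The function names, the example link sizes, and the idea of expressing the gap as a single "shortfall" figure are assumptions made for this example.

```python
def max_class5_kbps(link_kbps: int, share_percent: int = 33) -> int:
    """Upper bound on Class 5 (LLQ) bandwidth for a link, rounded down."""
    return link_kbps * share_percent // 100


def class5_shortfall_kbps(primary_kbps: int, backup_kbps: int) -> int:
    """Class 5 bandwidth available on the primary but not on the backup."""
    return max(0, max_class5_kbps(primary_kbps) - max_class5_kbps(backup_kbps))
```

With a hypothetical 2048-kbps primary and 512-kbps backup, the Class 5 ceilings are 675 kbps and 168 kbps, leaving a 507-kbps gap that call admission control or policing would have to absorb during failover.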
Shaping/Policing Strategy
There are many cases in which a connection's guaranteed bandwidth is not necessarily the same as the physical port speed. Hence, anything that is transmitted in excess of the available bandwidth is subject to policing and potentially can be dropped by the service provider without regard for the traffic classes. WAN technologies such as Frame Relay, ATM, and L2 and L3 IP/VPN services are good examples of this, whereby it is possible to transmit up to the physical access port speed. However, the service provider guarantees delivery for traffic only up to the contracted bandwidth such as CIR or sustainable cell rate (SCR) in the case of ATM.
The decision of whether excess traffic should be dropped or marked down to a different class depending on the applicable WAN technology should be left to the discretion of the network administrator. Traffic shaping and policing at the WAN edge means that there is more control over the type of excess traffic that should be sent to the provider's network. This avoids the chance that service providers will indiscriminately discard excess traffic that belongs to all classes of service.
A better approach than marking excess traffic down to a different class is to assign it a different drop preference within the same class, as defined in RFC 2597, Assured Forwarding PHB Group. The reason for this is that different queues are drained at different rates. Therefore, if you mark to a different class, you introduce out-of-sequence packet delivery, which has detrimental effects on the in-profile traffic. DSCP-based WRED is then employed to discard out-of-profile traffic aggressively ahead of in-profile traffic.
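Marking down within a class can be illustrated with the Assured Forwarding code points, where AFxy has DSCP value 8x + 2y (x is the class, 1 to 4; y is the drop precedence, 1 to 3). The Python sketch below uses illustrative function names and raises the drop precedence by one step while leaving the class, and therefore the queue, unchanged:

```python
def af_dscp(af_class: int, drop_precedence: int) -> int:
    """DSCP value of AF<class><drop_precedence>, for example AF31 -> 26."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF class must be 1..4 and drop precedence 1..3")
    return 8 * af_class + 2 * drop_precedence


def mark_down(dscp: int) -> int:
    """Raise drop precedence by one step within the same AF class.

    Non-AF code points and packets already at the highest drop
    precedence are returned unchanged.
    """
    af_class, low_bits = dscp >> 3, dscp & 0b111
    if 1 <= af_class <= 4 and low_bits in (2, 4):
        return dscp + 2
    return dscp
```

An out-of-profile AF31 packet (DSCP 26) becomes AF32 (DSCP 28). Because its three most significant bits still indicate class 3, it stays in the same queue and in sequence, but WRED discards it ahead of AF31 traffic under congestion.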
Real-time voice traffic is sensitive to delay and jitter. Therefore, it is recommended that whenever possible, excess burst traffic for this class should be policed. By default, real-time voice traffic that exceeds the bandwidth allocated to the strict priority queue (low-latency queuing [LLQ]) is allowed limited bursting.
When using network-based VPN services, such as MPLS VPN, and depending on the service provider offerings, each class can be allocated guaranteed bandwidth across the provider network. By shaping each class at the customer edge (CE), excess traffic could still be forwarded to the provider network but may be marked down to a different CoS or set to a higher drop preference value within the same class selector. This would ensure that if a service provider experiences network congestion, traffic is dropped based on the network administrator's preference rather than random discards by the provider.
In the case of L2/L3 VPN, it is also possible to have many sites connected to the same service provider network with varying connection speeds. Often, the hub site is serviced by a much larger connection while remote offices are connected at significantly reduced speed. This may cause an oversubscription of the remote WAN connection due to the peer-to-peer nature of L2/L3 VPNs. Egress shaping should be considered on the CE, particularly at the hub location, to prevent this situation from occurring. However, some level of oversubscription of the remote link may still occur, because remote-to-remote office traffic patterns are unpredictable and cannot be accounted for. L2 VPN services that share a common broadcast domain are not recommended because of the difficulty inherent in egress traffic shaping.
Queuing/Link Efficiency Strategy
QoS mechanisms such as LLQ and CBWFQ address the prioritization of L3 traffic to the router's interface driver. However, the underlying hardware that places actual bits on the physical wire is made up of a single transmit ring buffer (TX ring) that operates in a FIFO fashion. The result is the introduction of a fixed delay (commonly called serialization delay) to the overall end-to-end transmission of a packet before it is encoded onto the wire. Depending on link speed and the packet's size, this serialization delay can be significant and can have a severe impact on voice quality.
Small voice packets that are processed via LLQ are still queued behind other packets in the TX ring buffer. In the worst-case scenario, the voice packet would have to wait for a 1500-byte packet to be transmitted first. Hence, serialization delay becomes a major factor in the overall end-to-end latency of the voice packet.
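Serialization delay is simply the packet size in bits divided by the link rate. The Python sketch below makes the worst case concrete, assuming 1 kbps means 1000 bits per second; the function name is illustrative.

```python
def serialization_delay_ms(packet_bytes: int, link_kbps: int) -> float:
    """Time to clock a packet onto the wire, in milliseconds.

    Bits divided by kilobits-per-second yields milliseconds directly.
    """
    if link_kbps <= 0:
        raise ValueError("link rate must be positive")
    return packet_bytes * 8 / link_kbps
```

A voice packet queued behind a single 1500-byte packet on a 64-kbps link waits 187.5 ms before its first bit is transmitted, and still waits about 23.4 ms on a 512-kbps link.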
The size of the TX ring buffer represents a trade-off. If the buffer is too large, it is possible that too many data fragments may be placed in the queue before an LLQ fragment. This would result in the LLQ fragment's being delayed, causing higher latency and jitter. A buffer that is too small can keep higher-speed interfaces from achieving line rate. For 2-Mbps circuits, the default depth is two fragments. This means that at the worst there is the potential for up to two low-priority data fragments to be on the TX ring when an LLQ fragment is ready to transmit.
Therefore, fragment sizes must be calculated to account for a transmit delay of up to two fragments. Based on the fragment-size calculation, the worst-case jitter target for these low-speed links should be approximately 18 ms. MLP fragments in Cisco IOS have 11-byte L2 headers (with shared flags and long sequence numbers). Based on 11-byte overhead and fragment-size multiples of 32, the following calculation is used to derive the fragment size for MLP:
Then, you round down to multiples of 32, as shown in Table 5-1.
Table 5-1 Fragment Sizing
| Line Rate (in kbps) | Fragment Size (in Bytes) |
|---|---|
| 128 | 128 |
| 256 | 256 |
| 384 | 416 |
| 512 | 576 |
| 768 | 864 |
| 1024 | 1152 |
| > 1024 | No fragmentation |
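One plausible reading of the sizing rule can be sketched in Python under stated assumptions: the 18-ms worst-case target is split evenly across the two fragments that may sit on the TX ring (9 ms each), the 11-byte MLP header is subtracted from the bytes that fit in that time, and the result is rounded down to a multiple of 32. The function name and the even split are assumptions, not taken from the text. This reading reproduces the 128-, 256-, and 384-kbps rows of Table 5-1 and gives somewhat smaller fragments than the table at higher rates, so the table remains the reference.

```python
MLP_HEADER_BYTES = 11    # MLP L2 header size stated in the text
PER_FRAGMENT_MS = 9      # assumption: 18 ms target / 2 fragments on the TX ring
FRAGMENT_MULTIPLE = 32   # fragment sizes are multiples of 32 bytes


def mlp_fragment_size(link_kbps: int) -> int:
    """Estimate an MLP fragment size in bytes for a given link rate."""
    # kbps * ms = bits, so integer arithmetic is exact here.
    bytes_in_budget = link_kbps * PER_FRAGMENT_MS // 8
    payload = bytes_in_budget - MLP_HEADER_BYTES
    # Round down to the nearest multiple of 32 bytes.
    return payload // FRAGMENT_MULTIPLE * FRAGMENT_MULTIPLE
```

At 128 kbps, for instance, 9 ms carries 144 bytes; removing the 11-byte header leaves 133, which rounds down to 128.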
The overhead of FRF.12 is similar and does not affect the fragment sizes used. As a result, LFI (via MLP or FRF.12) is required for link rates of 1024 kbps or less.
For the low-speed links defined in Table 5-2, LFI techniques such as MLP and Frame Relay FRF.12 can be used to reduce the effect of serialization delay on voice traffic in such an environment. LFI functions by fragmenting larger packets into smaller-sized fragments such that, for a given low-speed link, all packets can be transmitted onto the wire with a serialization delay that is acceptable to voice traffic. For low-speed ATM links, it is inadvisable to use MLP LFI over ATM virtual circuits because of the high overhead associated with encapsulation for small packets. Implementing low-speed ATM links is becoming an uncommon practice. However, if such scenarios exist and if voice is a requirement, it is recommended that the link speed be provisioned above 1024 kbps to avoid serialization delay and the need for fragmentation.
Table 5-2 LFI Fragment Sizing: Serialization Delay for Link Speeds of 1024 kbps and Below
| Link Speed (in kbps) | 64-Byte Packet | 128-Byte Packet | 256-Byte Packet | 512-Byte Packet | 1024-Byte Packet | 1500-Byte Packet |
|---|---|---|---|---|---|---|
| 64 | 8 ms | 16 ms | 32 ms | 64 ms | 128 ms | 187 ms |
| 128 | 4 ms | 8 ms | 16 ms | 32 ms | 64 ms | 93 ms |
| 256 | 2 ms | 4 ms | 8 ms | 16 ms | 32 ms | 46 ms |
| 512 | 1 ms | 2 ms | 4 ms | 8 ms | 16 ms | 23 ms |
| 768 | 0.64 ms | 1.3 ms | 2.6 ms | 5.1 ms | 10.3 ms | 15 ms |
| 1024 | 0.4 ms | 0.98 ms | 2 ms | 3.9 ms | 7.8 ms | 11.7 ms |
For WAN services that do not support LFI, such as digital subscriber line (DSL) and cable technologies, it is recommended that a manual override of the TCP segment size be configured on the connected router interface. This causes TCP sessions to use smaller segments, which reduces the effect of serialization delay on low-speed broadband circuits.