Complete guide to network virtualization
There are a number of emerging and proposed standard protocols focused on optimizing the support that data center Ethernet LANs provide for server virtualization. Several of these protocols are aimed at network virtualization via the creation of multiple virtual Ethernet networks that can share a common physical infrastructure in a manner that is somewhat analogous to multiple virtual machines sharing a common physical server.
Most network virtualization protocols create virtual network overlays using encapsulation and tunneling. The most commonly discussed protocols include VXLAN, NVGRE, STT, and SPB MAC-in-MAC. SPB is already an IEEE standard, and of the other proposals it is likely that only one, most probably VXLAN, will achieve IETF standard status.
Traditional network virtualization
The one-to-many virtualization of network entities is not a new concept. The most common examples are VLANs and Virtual Routing and Forwarding (VRF) instances.
VLANs partition the network into as many as 4,094 broadcast domains, as designated by a 12-bit VLAN ID tag in the Ethernet header. VLANs have been a convenient means of isolating different types of traffic that share the same switched LAN infrastructure.
In data centers that make extensive use of server virtualization, the limited number of VLANs can present problems, especially when large numbers of tenants need to be supported, each requiring multiple VLANs. Extending VLANs across the data center via 802.1Q trunks to support VM mobility adds operational cost and complexity. In data centers based on Layer 2 server-to-server connectivity, large numbers of VMs, each with its own media access control address, can also place a burden on the forwarding table capacities of Layer 2 switches.
VRF is a form of Layer 3 network virtualization in which a physical router supports multiple virtual router instances, each running its own routing protocol instance and maintaining its own forwarding table.
Unlike VLANs, VRF does not use a tag in the packet header to designate the specific VRF to which a packet belongs. The appropriate VRF is derived at each hop based both on the incoming interface and on information in the frame. An additional requirement is that each intermediate router on the end-to-end path followed by a packet needs to be configured with a VRF instance that can forward that packet.
Network Virtualization with Overlays
Because of the shortcomings of the traditional VLAN or VRF models, a number of new techniques for creating virtual networks have recently emerged. Most are based on the use of encapsulation and tunneling to construct multiple virtual network topologies overlaid on a common physical network.
A virtual network can be a Layer 2 network or a Layer 3 network, while the physical network can be Layer 2, Layer 3 or a combination, depending on the overlay technology. With overlays, the outer (encapsulating) header includes a virtual network instance ID (VNID) field, generally 24 bits wide, that identifies the virtual network designated to forward the packet.
Virtual network overlays can provide a wide range of benefits, including:
• Support for essentially unlimited numbers of virtual networks; for example, the 24-bit VNID field enables the creation of up to 16 million virtual networks (see the short calculation after this list).
• Decoupling of the virtual network topology, service category (L2 or L3) and addressing from those of the physical network. The decoupling avoids issues such as MAC table size in physical switches.
• Support for virtual machine mobility independent of the physical network. If a VM changes location, even to a new subnet, the switches at the edge of the overlay simply update their mapping tables to reflect the new location of the VM. The network for a new VM can be provisioned entirely at the edge of the network.
• Ability to manage overlapping IP addresses between multiple tenants.
• Support for multi-path forwarding within virtual networks.
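To put those numbers in perspective, the following back-of-the-envelope calculation (a purely illustrative Python snippet) contrasts the 12-bit VLAN ID space with a 24-bit VNID space:

    # Illustrative arithmetic only: compare the 12-bit VLAN ID space
    # with the 24-bit virtual network ID (VNID) carried by most overlay headers.
    vlan_ids = 2**12 - 2   # 4,094 usable VLANs (IDs 0 and 4095 are reserved)
    vnids = 2**24          # 16,777,216 possible virtual networks

    print(f"Usable VLAN IDs: {vlan_ids}")
    print(f"24-bit VNIDs:    {vnids} (~{vnids // vlan_ids}x more)")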
The main difference between the various overlay protocols lies in their encapsulation formats and the control plane functionality that allows ingress (encapsulating) devices to map a frame to the appropriate egress (decapsulating) device.
VXLAN
Virtual eXtensible LAN (VXLAN) virtualizes the network by creating a Layer 2 overlay on a Layer 3 network via MAC-in-UDP encapsulation. The VXLAN segment is a Layer 3 construct that replaces the VLAN as the mechanism that segments the data center LAN for VMs.
Therefore, a VM can only communicate or migrate within a VXLAN segment. The VXLAN segment has a 24-bit VXLAN Network Identifier. VXLAN is transparent to the VM, which still communicates using MAC addresses. The VXLAN encapsulation is performed through a function known as the VXLAN Tunnel End Point (VTEP), typically provided by a hypervisor vSwitch or possibly a physical access switch.
The encapsulation allows Layer 2 communications with any end points that are within the same VXLAN segment, even if these end points are in a different IP subnet. This allows live migrations of VMs to transcend Layer 3 boundaries. Since MAC frames are encapsulated within IP packets, there is no need for the individual Layer 2 switches to learn MAC addresses.
This alleviates MAC table hardware capacity issues on these switches. Overlapping IP and MAC addresses are handled by the VXLAN ID, which acts as a qualifier/identifier for the specific VXLAN segment within which those addresses are valid. The VXLAN control solution uses flooding based on Any Source Multicast (ASM) to disseminate end system location information.
As noted, VXLAN uses a MAC-in-UDP encapsulation. One of the reasons for this is that modern Layer 3 devices parse the 5-tuple (including Layer 4 source and destination ports). While VXLAN uses a well-known destination UDP port, the source UDP port can be any value. As a result, a VTEP can spread all the flows from a single VM across many UDP source ports. This allows the intermediate Layer 3 switches to make efficient use of multi-pathing even in the case of multiple flows between only two VMs.
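To make the encapsulation and the source-port behavior concrete, here is a minimal, purely illustrative Python sketch of a VTEP's encapsulation step. The function name, the hash used to pick the source port and the use of destination port 4789 are assumptions made for this example (early VXLAN implementations also used port 8472), not details taken from the draft.

    import struct
    import zlib

    VXLAN_DST_PORT = 4789  # assumed well-known destination port for this sketch

    def vtep_encapsulate(inner_frame, vni):
        """Illustrative VTEP step: prepend the 8-byte VXLAN header and derive
        a UDP source port from a hash of the inner headers so that Layer 3
        switches hashing on the 5-tuple can spread flows across paths."""
        # VXLAN header: flags byte (I bit set), 24 reserved bits,
        # 24-bit VXLAN Network Identifier, 8 reserved bits
        vxlan_header = struct.pack("!B3xI", 0x08, (vni & 0xFFFFFF) << 8)
        # Hash roughly the inner Ethernet/IP/TCP headers (first 54 bytes) so that
        # packets of one flow keep one port while different flows get different ports
        src_port = 49152 + (zlib.crc32(inner_frame[:54]) % 16384)
        return vxlan_header + inner_frame, src_port, VXLAN_DST_PORT

Because the source port is derived from the inner flow's headers rather than its payload, packets of a given flow stay on one path while different flows, even between the same pair of VMs, are spread across the available paths.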
Where nodes on a VXLAN overlay network need to communicate with nodes on a legacy (i.e., VLAN) portion of the network, a VXLAN gateway can be used to perform the required tunnel termination functions, including encapsulation/decapsulation. The gateway functionality could be implemented in either hardware or software.
VXLAN is the subject of an IETF draft supported by VMware, Cisco, Arista Networks, Broadcom, Red Hat and Citrix. VXLAN is also supported by IBM. Pre-standard implementations in hypervisor vSwitches and physical switches are beginning to emerge.
NVGRE
Network Virtualization using Generic Routing Encapsulation (NVGRE) uses the GRE tunneling protocol defined by RFC 2784 and RFC 2890. NVGRE is similar in most respects to VXLAN, with two major exceptions. The first is that, while GRE encapsulation is not new, most network devices do not parse GRE headers in hardware, which may lead to performance issues and to problems with 5-tuple hashes for traffic distribution in multi-path data center LANs.
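For comparison with the VXLAN sketch above, the illustrative Python snippet below (function names invented for the example) shows the NVGRE encapsulation: the 24-bit virtual network identifier travels in the GRE Key field along with an 8-bit flow ID, and there is no outer UDP header, which is why devices that hash on the Layer 4 ports of the 5-tuple have little entropy to work with.

    import struct

    GRE_PROTO_TEB = 0x6558  # Transparent Ethernet Bridging: an Ethernet frame inside GRE

    def nvgre_encapsulate(inner_frame, vsid, flow_id=0):
        """Illustrative NVGRE step: a GRE header whose Key field carries the
        24-bit Virtual Subnet ID plus an 8-bit flow ID. There are no UDP/TCP
        ports in the outer headers for a 5-tuple hash to use."""
        flags = 0x2000  # Key Present bit set, version 0
        key = ((vsid & 0xFFFFFF) << 8) | (flow_id & 0xFF)
        gre_header = struct.pack("!HHI", flags, GRE_PROTO_TEB, key)
        return gre_header + inner_frame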
The other exception is that the current IETF NVGRE draft does not specify a solution for the control plane functionality described earlier in the general discussion of network overlays, leaving that for a future draft or possibly for SDN (Software Defined Networking) controllers to address.
Some of the sponsors of NVGRE (i.e., Microsoft and Emulex) expect that some of the performance issues can be addressed by intelligent network interface cards (NICs) that offload NVGRE endpoint processing from the hypervisor vSwitch. The intelligent NICs would also have APIs for integration with overlay controllers and hypervisor management systems. Emulex has also demonstrated intelligent NICs that offload VXLAN processing from the VMware Distributed Switches.
STT
Stateless Transport Tunneling (STT) is a third overlay technology for creating Layer 2 virtual networks over a Layer 2/Layer 3 physical network within the data center. Conceptually, there are a number of similarities between VXLAN and STT. The tunnel endpoints are typically provided by hypervisor vSwitches, the VNID is 24 bits wide, and the transport source header is manipulated to take advantage of multi-pathing.
STT encapsulation differs from NVGRE and VXLAN in two ways. First, it uses a stateless TCP-like header inside the IP header that allows tunnel endpoints within end systems to take advantage of TCP segmentation offload (TSO) capabilities of existing TCP offload engines (TOE) that reside on server NICs.
The benefits to the host include lower CPU utilization and higher utilization of 10 Gigabit Ethernet access links. Second, STT allocates more header space to per-packet metadata, which provides added flexibility for the virtual network control plane. With these features, STT is optimized for hypervisor vSwitches acting as the encapsulation/decapsulation tunnel endpoints.
The STT IETF draft sponsored by Nicira does not specify a control plane solution. However, the Nicira network virtualization solution includes OpenFlow-like hypervisor vSwitches and a control plane based on a centralized network virtualization controller that facilitates management of virtual networks.
Shortest Path Bridging MAC-in-MAC (SPBM)
IEEE 802.1aq SPBM uses IEEE 802.1ah MAC-in-MAC encapsulation and the IS-IS routing protocol to provide Layer 2 network virtualization via VLAN extension in addition to the loop-free equal cost multi-path Layer 2 forwarding functionality normally associated with SPB.
VLAN extension is enabled by the 24-bit Virtual Service Network (VSN) Instance Service IDs (I-SIDs) that are part of the outer MAC encapsulation. Unlike other network virtualization solutions, no changes are required in the hypervisor vSwitches or NICs, and switching hardware that supports IEEE 802.1ah MAC-in-MAC encapsulation already exists. For SPBM, the control plane is provided by the IS-IS routing protocol.
SPBM can also be extended to support Layer 3 forwarding and Layer 3 virtualization, as described in the IP/SPB IETF draft, using IP encapsulated within the outer SPBM MAC header. This draft specifies how SPBM nodes can perform inter-I-SID or inter-VLAN routing. In addition, IP/SPB also provides for Layer 3 VSNs by extending Virtual Routing and Forwarding (VRF) instances at the edge of the network across the SPBM network without requiring that the core switches also support VRF instances.
VLAN-extension VSNs and VRF-extension VSNs can run in parallel on the same SPB network to provide isolation of both Layer 2 and Layer 3 traffic for multi-tenant environments. With SPBM, all the core switches starting at the access or aggregation switches that define the SPBM boundary need to be SPBM-capable. SPBM hardware switches are currently available from Avaya and Alcatel-Lucent.
Alternative solutions
A discussion of network virtualization would not be complete without at least a mention of two Cisco protocols: Overlay Transport Virtualization (OTV) and Locator/ID Separation Protocol (LISP).
OTV is optimized for inter-data center VLAN extension over the WAN or Internet using MAC-in-IP encapsulation. It prevents flooding of unknown destinations across the WAN by advertising MAC address reachability using IS-IS routing protocol extensions.
LISP is an encapsulating IP-in-IP technology that allows end systems to keep their IP address (ID) even as they move to a different subnet within the network (Location). By using LISP VM-Mobility, IP endpoints such as VMs can be relocated anywhere regardless of their IP addresses while maintaining direct path routing of client traffic. LISP also supports multi-tenant environments with Layer 3 virtual networks created by mapping VRFs to LISP instance-IDs.
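As a toy illustration of that ID/location split (the table layout and function below are invented for this example and are not LISP message formats), the endpoint's own address acts as a key into a mapping that is updated when the VM moves, while a per-tenant instance ID keeps overlapping address spaces apart:

    # Toy illustration: the EID (the VM's own IP address) stays fixed; only the
    # locator (RLOC) it maps to changes when the VM moves. Keying the table by
    # (instance_id, eid) mirrors how LISP instance-IDs keep tenants' overlapping
    # address spaces separate.
    mapping = {
        (1, "10.1.1.5"): "192.0.2.10",    # tenant 1: VM currently behind this router
        (2, "10.1.1.5"): "198.51.100.7",  # tenant 2: same EID in a different virtual network
    }

    def lookup_rloc(instance_id, eid):
        return mapping[(instance_id, eid)]

    # When the tenant-1 VM migrates, only its mapping entry changes;
    # clients keep sending to 10.1.1.5 and traffic is tunneled to the new locator.
    mapping[(1, "10.1.1.5")] = "203.0.113.22"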
In addition, future versions of the OpenFlow protocol will undoubtedly support some standards-based overlay functionality. In the interim, OpenFlow can potentially provide another type of network virtualization by isolating network traffic through the segregation of flows. One very simple way to do this is to isolate sets of MAC addresses without relying on VLANs by adding a filtering layer to the OpenFlow controller. This type of functionality is available in v0.85 of the Big Switch Networks Floodlight controller. In multi-tenant environments there is also the potential for the OpenFlow controller to support a separate controller instance for each tenant.
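As a hedged illustration of this kind of flow-based isolation (a controller-agnostic Python sketch, not the Floodlight API), the policy can be expressed as nothing more than MAC-address groupings that a controller would translate into per-switch flow entries:

    # Invented, controller-agnostic representation of MAC-based tenant isolation:
    # MACs within a tenant may talk to each other; anything crossing tenants is dropped.
    tenants = {
        "tenant-a": {"00:00:00:aa:00:01", "00:00:00:aa:00:02"},
        "tenant-b": {"00:00:00:bb:00:01"},
    }

    def allow(src_mac, dst_mac):
        """Return True only if both MACs belong to the same tenant."""
        return any(src_mac in macs and dst_mac in macs for macs in tenants.values())

    # A controller would install a forwarding flow entry for a (src_mac, dst_mac)
    # pair only when allow() is True, and drop (or simply never install) the rest.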
Summary
The IT industry is in a state of dramatic flux. One of the primary technology drivers of this flux is the ongoing adoption of virtualization, which started with server virtualization and is just now impacting the network. There are many technologies and techniques that IT organizations can use to implement network virtualization. These include two Cisco protocols, Overlay Transport Virtualization and the Locator/ID Separation Protocol, as well as a number of emerging technologies based on encapsulation and tunneling, e.g., VXLAN. In addition, the interest that IT organizations have in SDN is starting to accelerate, and network virtualization is one of the primary use cases associated with SDN.
Metzler heads Ashton, Metzler & Associates. He can be reached at [email protected]