Network Infrastructure: Virtualization Requirements Fuel Network Switch Design

Next-generation data centers are evolving quickly, driven by dramatic increases in network traffic that keep step with the global rise in content consumption. Bandwidth growth and the need for scalability appear to be unstoppable, yet network designs must find a way to meet these demands cost-effectively. Cloud networking technologies - coupled with the steady growth of virtual machines (VMs) in network deployments - have introduced new and more advanced switch performance considerations. Wide deployment of 10/40/100GbE cloud-optimized switches enables higher server utilization and dynamic provisioning of IT infrastructure to meet fast-changing business objectives and network usage models.

At the same time, network topology, performance metrics, and feature requirements can differ dramatically in virtualized environments such as private and public cloud networks; this in turn impacts the design and implementation of data center network switches. Balancing performance with cost-effective scalability is a necessity for today's network infrastructure design. It requires a high-level understanding of three things: network performance as experienced by applications running in VMs, application mobility through VM migration, and clustered applications that demand high rack-to-rack networking performance.

Essentials of Standards-Based, VM-Aware Switching

VM environments increasingly require native, OS-based server-level performance, without the performance penalty incurred by virtualized servers running multiple VMs. To address this, and to improve the performance and scalability of applications that run in VMs, several standards-based switching technologies have emerged, such as SR-IOV (Single Root I/O Virtualization, from the PCI Special Interest Group) and VEPA/EVB (Virtual Ethernet Port Aggregator/Edge Virtual Bridging, from IEEE 802.1). The implications of these technologies for network switches can be broadly classified as the need for "VM switching," which is essential to ensuring adequate networking performance for VMs.

Effective VM switching means access layer switches in the data center network (e.g., top-of-rack switches, blade switches, or end-of-row switches) must support a large number of virtual switch ports (VSPs); these directly service VMs running on the servers that are connected to the access layer switch. Further, access layer switches must also support an adequate number of queues that can be allocated on a per-VSP basis, delivering per-VM quality of service (QoS) and traffic shaping. Ideally, thousands of queues would be implemented and available for dynamic allocation across VSPs on an as-needed basis.
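The idea of a shared queue pool allocated dynamically across VSPs can be sketched in a few lines of Python. This is only an illustrative model, not a description of any switch's actual implementation: the `QueuePool` class, the VSP names, and the pool size of 4096 are all assumptions chosen for the example.

```python
class QueuePool:
    """Hypothetical shared pool of queues handed out to virtual switch ports."""

    def __init__(self, total_queues):
        self.free = list(range(total_queues))  # queue IDs not yet assigned
        self.by_vsp = {}                       # VSP name -> list of queue IDs

    def allocate(self, vsp, count):
        """Assign `count` queues to a VSP; fail if the pool cannot satisfy it."""
        if count > len(self.free):
            raise RuntimeError(
                f"only {len(self.free)} queues free, {count} requested"
            )
        granted = self.free[:count]
        del self.free[:count]
        self.by_vsp.setdefault(vsp, []).extend(granted)
        return granted

    def release(self, vsp):
        """Return all of a VSP's queues to the pool (e.g. its VM moved away)."""
        self.free.extend(self.by_vsp.pop(vsp, []))


# Assumed pool size and VSP names, for illustration only.
pool = QueuePool(total_queues=4096)
web_queues = pool.allocate("vsp-web-01", 4)   # per-VM QoS classes for a web VM
db_queues = pool.allocate("vsp-db-01", 8)     # a database VM gets more queues
pool.release("vsp-web-01")                    # queues become available again
print(len(pool.free))
```

Because queues are drawn from one pool rather than fixed per port, a VSP that needs finer-grained QoS can take more queues while idle VSPs consume none.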

VSPs in access layer switches must also support features including link aggregation, load balancing, traffic mirroring, and statistics counters, just as these are available for physical switch ports. Such features are essential to giving VMs the same level of reliability, performance, and monitoring as physical servers.

VM switching is currently implemented in Broadcom's Smart-NV (Network Virtualization) enabled switches, based on the StrataXGS architecture and engineered specifically to meet current feature and scale requirements of private and public cloud networks. Figure 1 illustrates how these VM switching requirements are deployed, with virtual switch ports supporting link aggregation, queuing, access control list (ACL), statistics, and mirroring services similar to how those services are readily available for physical ports.

Figure 1. Broadcom's SmartSwitch technologies were developed to ensure that network infrastructure design requirements can be implemented comprehensively, cost-effectively, and in volume scale.

Evolving Rack-to-Rack Traffic Patterns are Redefining Network Topologies

The characteristics of network traffic are also evolving, with increased east-to-west traffic redefining ideal network topologies in today's data centers. Instead of flowing up and down through network tiers (access, distribution, and core layers, and back again), data is moving from server to server, server to storage, and server rack to server rack. The significant growth of east-to-west traffic is a result of web, application, and database server applications running as VMs that may reside in any server in any rack, coupled with the increased use of clustered applications such as Hadoop. Most importantly, this trend is changing the inherent design of network topologies, from oversubscribed and tiered networks to fast, fat, and flat networks, which require specific new features in network switches.

Optimal switch solutions must support live VM migration as part of this traffic pattern, allowing layer 2 (L2) networks to scale across pods or sites within the data center or even across data centers. Live VM migration is essential for increasing server utilization, improving disaster recovery, and meeting the overall IT goal of implementing a dynamic data center infrastructure. Live VM migration across servers, racks, or pods requires that they reside in the same L2 network segments, referred to as a flat network.

From Tiered to Fast, Fat, and Flat Networks

Tiered networks were designed for north-south traffic, incorporating physical static servers in a tiered topology suited to single-tenant data centers. Modern workloads, characterized by heavier east-west traffic, require efficiency and bandwidth - and are better served by fast, fat, and flat network designs than by the tiered, hierarchical models of earlier generations. Fast, fat, and flat networks can be implemented using high-bandwidth, high-density, fixed-configuration aggregation and access layer switches that are connected in a spine-leaf model. In a flattened network, servers spend less time waiting for data, which drives up server utilization and can substantially increase application or system throughput. Flat network topologies are ideal for VMs and mobility, enabling any application on any server, which is essential for multitenant, cloud-based data centers.

Fat implies modular horizontal scaling and full cross-sectional bandwidth across racks of servers. For example, nonblocking or low oversubscription ratios (between downlink and uplink ports) can be supported through the very high bandwidth and high port density available in Smart-NV enabled Broadcom switch solutions. Nonblocking network switches have higher demands on cost and power per gigabit of bandwidth, but when complemented with multipathing of the links, full cross-sectional bandwidth or a fat network topology can be achieved (Fig. 2).

Figure 2. Increasing rack-to-rack traffic patterns are changing inherent design of network topologies. Fast, fat and flat data center networks are required, and are readily enabled by Smart-NV technology, coupled with Broadcom's industry-leading high density, low latency, and 10/40/100GbE line rate performance.
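The oversubscription ratio between downlink and uplink ports mentioned above is simple arithmetic: total downlink bandwidth divided by total uplink bandwidth, with a ratio of 1 meaning nonblocking. The sketch below shows the calculation; the port counts and speeds are hypothetical leaf-switch configurations, not figures for any specific product.

```python
from fractions import Fraction


def oversubscription_ratio(downlinks, downlink_gbps, uplinks, uplink_gbps):
    """Return total downlink bandwidth divided by total uplink bandwidth."""
    return Fraction(downlinks * downlink_gbps, uplinks * uplink_gbps)


# Hypothetical leaf switch: 48 x 10GbE server ports, 4 x 40GbE uplinks.
ratio = oversubscription_ratio(48, 10, 4, 40)        # 480 / 160
# Hypothetical nonblocking leaf: 32 x 10GbE server ports, 8 x 40GbE uplinks.
nonblocking = oversubscription_ratio(32, 10, 8, 40)  # 320 / 320

print(f"oversubscribed leaf: {ratio}:1")
print(f"nonblocking leaf:    {nonblocking}:1")
```

Note that the nonblocking figure only translates into full cross-sectional bandwidth if traffic is actually spread across all uplinks, which is where multipathing comes in.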

Defining Flat in L2 and L3 Implementations

Flat is used to describe two distinct network design approaches. It is sometimes interpreted as an L2 network (either physical or virtual) that spans multiple pods or sites within a data center, or even multiple data centers. TRILL (Transparent Interconnection of Lots of Links) or SPB (Shortest Path Bridging) technologies can be deployed to create large, scalable, flat physical L2 networks with no constraints related to multipathing. When the underlying network is L3, a flat virtual L2 network can be achieved using L2oL3 (layer 2 over layer 3) network virtualization technologies, such as VXLAN (Virtual Extensible LAN), NVGRE (Network Virtualization using GRE), or other similar overlay network technologies.

Occasionally, flat is also interpreted as a single-tier or two-tier network, rather than the traditional three-tier network consisting of access, aggregation, and core layers. The various L2- and L3-based "network flattening" technologies described above can be used to build single-tier or two-tier networks. Further, management of such flat networks is greatly simplified when multiple access and aggregation switches in the network appear as a single switch. Technologies such as VN-tag, IEEE 802.1BR, or HiGig, integrated into sophisticated data center switches that employ Smart-NV technologies, enable a holistic data plane and control plane solution for flat networks.

Enabling Network Virtualization at Cloud-Scale

L2oL3 overlay technologies further extend the benefits of fast, fat, and flat architectures, enabling network virtualization at cloud-scale and providing an essential methodology for combining L2 and L3 network topologies. For example, mega-scale or Internet-scale data centers commonly rely on inexpensive L3 technologies for physical network infrastructure. This follows the success and scalability of the Internet, built on IP and a scalable hierarchical addressing scheme rather than flat L2 addressing. By implementing L2oL3 technologies, L3 networks have a natural way to create a flat L2 network - readily managing VM migration within and across data centers, as well as scale and multitenancy. Options include L2GRE (Layer 2 over Generic Routing Encapsulation), VXLAN (Virtual Extensible LAN), and NVGRE (Network Virtualization using Generic Routing Encapsulation).

These technologies are essential in supporting the unique requirements of cloud-scale networks - which must extend the scale of virtual LANs, and provide VM scale, network partitioning, and hybrid cloud enablement for multitenancy support. Perhaps most importantly, these highly scalable networks must allow efficient VM-based workload placement through live VM migration across pods or sites in a single data center or across data centers.

Smart Switching Maximizes L2oL3 Performance

To enable network virtualization at cloud-scale effectively, today's high-performance switch solutions must support new L2oL3 overlay network technologies such as L2GRE, VXLAN, and NVGRE. While L2GRE is implemented in the open source Open vSwitch (OVS) project, VXLAN and NVGRE are industry collaborations initiated by VMware and Microsoft, respectively, along with their partners. Although the packet format and control plane implementations for the three technologies may differ, their impact on Ethernet switches can be grouped together into a common set of requirements referred to as L2oL3 overlay technologies (Fig. 3).

Figure 3. A generic frame format representing the three L2oL3 technologies (VXLAN, NVGRE, and L2GRE), including the overlay header and tunnel ID.
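To make the "header and tunnel ID" idea concrete, the following Python sketch packs and parses the 8-byte VXLAN header. The layout is taken from the published VXLAN specification (RFC 7348), not from this article: an 8-bit flags field in which 0x08 marks the network identifier as valid, 24 reserved bits, a 24-bit VXLAN Network Identifier (VNI), and 8 more reserved bits. The function names and the example VNI are illustrative assumptions, and the outer Ethernet/IP/UDP headers are omitted.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348: the VNI field is valid


def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header carrying a 24-bit tunnel ID (VNI)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Extract the VNI from a VXLAN header, checking the valid-VNI flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_word >> 8


header = pack_vxlan_header(5000)  # 5000 is an arbitrary example tenant segment
print(header.hex())
print(unpack_vni(header))
```

NVGRE and L2GRE carry their tunnel IDs in the GRE key field instead, but a switch treats all three similarly: parse the outer headers, extract the tunnel ID, and map it to a virtual L2 segment.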

L2oL3-based network virtualization technologies not only remove the VLAN-based constraint (a limit of roughly 4K VLAN IDs) that hampers scaling in multitenant networks; they also promise to detach network virtualization-related configuration from physical switches. This enables software-defined networks across multivendor equipment - a desirable characteristic for both public and complex, hybrid cloud deployments.
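The scale difference behind the 4K VLAN limit is easy to quantify. The short calculation below compares the 12-bit VLAN ID space with the 24-bit segment identifiers used by VXLAN and NVGRE; the 24-bit width comes from those protocols' specifications rather than from this article, and the constant names are chosen for illustration.

```python
VLAN_ID_BITS = 12     # IEEE 802.1Q VLAN ID field width
OVERLAY_ID_BITS = 24  # VXLAN VNI / NVGRE VSID width (per their specifications)

vlan_ids = 2**VLAN_ID_BITS        # 4096, the "4K" limit
overlay_ids = 2**OVERLAY_ID_BITS  # 16777216 virtual segments
growth = overlay_ids // vlan_ids  # how many times larger the ID space is

print(f"VLAN IDs:    {vlan_ids}")
print(f"Overlay IDs: {overlay_ids}")
print(f"Growth:      {growth}x")
```

In a multitenant network where each tenant may need several isolated segments, the larger ID space is what allows segment counts to grow well beyond a few thousand.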

L2oL3 overlay techniques are well-suited to large-scale data centers that rely on the proven scalability of L3 addressing and multipathing technologies. Multiple overlays or tunnels are implemented on top of the physical L3 network, each of them a virtual L2 network that can then be assigned to a tenant. In turn, virtual L2 networks can enable live VM migration across pods, sites, or data centers that are connected using L3 networks. L2oL3 technologies - and the sophisticated network switches that support their inherent range of control plane features - provide substantial flexibility in network design. Depending on the use case, gateways may be placed closer to legacy equipment or elsewhere in the network, and flat virtual L2 segments can be created on a per-tenant basis at VM-level granularity. Smart-NV enabled switches further allow network operators to monitor and analyze traffic on a per-tenant basis. This type of flexible deployment is essential, and is supported by multiple bandwidth and port density configurations suited to placement anywhere in the data center network.
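The relationship described above, in which each overlay is a virtual L2 network assigned to a tenant and traffic is monitored per tenant, can be modeled with a small lookup table. This is a conceptual sketch only: the `OverlayTable` class, tenant names, tunnel IDs, and frame sizes are all hypothetical and do not represent any switch's actual tables or counters.

```python
from collections import Counter


class OverlayTable:
    """Hypothetical map from overlay tunnel ID to tenant, with byte counters."""

    def __init__(self):
        self.tenant_by_vni = {}           # tunnel ID -> owning tenant
        self.bytes_by_tenant = Counter()  # tenant -> bytes seen so far

    def assign(self, vni, tenant):
        """Dedicate one virtual L2 network (tunnel ID) to a tenant."""
        if vni in self.tenant_by_vni:
            raise ValueError(f"tunnel ID {vni} is already assigned")
        self.tenant_by_vni[vni] = tenant

    def account(self, vni, frame_len):
        """Attribute a frame to the tenant that owns its tunnel ID."""
        tenant = self.tenant_by_vni[vni]
        self.bytes_by_tenant[tenant] += frame_len
        return tenant


table = OverlayTable()
table.assign(5000, "tenant-a")  # a tenant may own several segments
table.assign(5001, "tenant-a")
table.assign(6000, "tenant-b")

# Example frames as (tunnel ID, length in bytes); values are made up.
for vni, length in [(5000, 1500), (5001, 64), (6000, 9000)]:
    table.account(vni, length)

print(dict(table.bytes_by_tenant))
```

Keying statistics on the tunnel ID rather than on a physical port is what makes per-tenant monitoring independent of where a tenant's VMs happen to be running.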

Cloud Scale Network Virtualization Moving Forward

Legacy considerations and restrictions on usage models may vary dramatically between public and private clouds; however, a fast, fat, and flat topology has value in both environments. Innovative and flexible L2oL3 implementations are enabling network virtualization in each type of environment, creating the flat network designs that address efficiency and scalability for both service providers and cloud users.

VM-aware switching, increased east-to-west traffic patterns, and the rise of fast, fat, and flat networks are essential considerations in these cloud-scale network deployments. Sophisticated switching technologies must support these needs - effectively enabling the multiple virtualization technologies required by enterprise private clouds, as well as the multitenancy, scale, and cost-effectiveness integral to public cloud deployments. Next-generation switching meets infrastructure virtualization requirements today - offering flexible, comprehensive performance and scale for current and next-generation private, public, and hybrid cloud networks.