
White paper
Cisco public

Cisco Application Centric Infrastructure Design Guide

© 2022 Cisco and/or its affiliates. All rights reserved.
Introduction

Cisco Application Centric Infrastructure (Cisco ACI) technology enables you to integrate virtual and physical workloads in a programmable, multihypervisor fabric to build a multiservice or cloud data center. The Cisco ACI fabric consists of discrete components connected in a spine and leaf switch topology that is provisioned and managed as a single entity.

This document describes how to implement a fabric such as the one depicted in Figure 1.

The design described in this document is based on the following reference topology:
● Two spine switches interconnected to several leaf switches
● Top-of-Rack (ToR) leaf switches for server connectivity, with a mix of front-panel port speeds: 1/10/25/40/50/100/200/400-Gbps
● Physical and virtualized servers dual-connected to the leaf switches
● A pair of border leaf switches connected to the rest of the network with a configuration that Cisco ACI calls a Layer 3 Outside (L3Out) connection
● A cluster of three Cisco Application Policy Infrastructure Controllers (APICs) dual-attached to a pair of leaf switches in the fabric

Figure 1 Cisco ACI Fabric

The network fabric in this design provides the following main services:
● Connectivity for physical and virtual workloads
● Partitioning of the fabric into multiple tenants, which may represent departments or hosted customers
● The ability to create shared-services partitions (tenants) to host servers or virtual machines whose computing workloads provide infrastructure services such as Network File System (NFS) and Microsoft Active Directory to the other tenants
● The capability to provide dedicated or shared Layer 3 routed connections to the tenants present in the fabric
Note: The documentation set for this product strives to use bias-free language. For the purposes of this documentation set, bias-free is defined as language that does not imply discrimination based on age, disability, gender, racial identity, ethnic identity, sexual orientation, socioeconomic status, and intersectionality. Exceptions may be present in the documentation due to language that is hardcoded in the user interfaces of the product software, language used based on RFP documentation, or language that is used by a referenced third-party product.

Components and versions

A Cisco ACI fabric can be built using a variety of Layer 3 switches that, while compatible with each other, differ in form factor and ASICs to address multiple requirements. The choice depends on criteria that include the following:
● Type of physical layer and speed required
● Amount of Ternary Content-Addressable Memory (TCAM) space required
● Analytics support
● Multicast routing in the overlay
● Support for link-layer encryption
● Fibre Channel over Ethernet (FCoE) support

You can find the list of available leaf and spine switches at the following link:

Cisco ACI software releases can be long-lived releases or short-lived releases:
● Long-lived releases: These releases undergo frequent maintenance to help ensure quality and stability. Long-lived releases are recommended for the deployment of widely adopted functions or for networks that will not be upgraded frequently.
● Short-lived releases: These releases typically introduce new hardware or software innovations. Short-lived releases are recommended for deployment if the adoption of new hardware or of software innovations is of interest. As a best practice, short-lived software releases should be upgraded to the next available long-lived release for stability and longer maintenance benefits.

At the time of this writing, Cisco ACI 4.2(7f) is considered the latest long-lived release.
This document is based on features that may be present in releases later than Cisco ACI 4.2(7f) up to the currently available release, which is Cisco ACI release 5.1(3e). The majority of what is recommended in this design document is applicable to Cisco ACI fabrics running Cisco ACI release 4.2(7f) or later, with or without Virtual Machine Manager integration, unless explicitly indicated.

Cisco ACI can integrate with every virtualized server using physical domains and the EPG Static Port configuration for "static binding" (more on this later), and with many external controllers using direct API integration, which is called Virtual Machine Manager (VMM) integration. Cisco APIC can integrate using VMM integration with VMware ESXi hosts with VMware vSphere, Hyper-V servers with Microsoft SCVMM, RedHat Virtualization, Kubernetes, OpenStack, OpenShift, and more. Cisco ACI 5.1(1) and later releases can integrate with VMware NSX-T Data Center (NSX).

The integration using static binding does not require any special software version, whereas the integration using Virtual Machine Manager requires specific Cisco ACI versions to integrate with specific Virtual Machine Manager versions.
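Statements such as "release 4.2(7f) or later" imply an ordering of release strings. The following is a minimal Python sketch of one way to compare them. It assumes the format major.minor(maintenance number plus optional patch letters), as seen in the release names above, and it assumes that patch letters order alphabetically. The function names are illustrative and are not part of any Cisco tool.

```python
import re
from typing import Tuple

# Assumed format: "<major>.<minor>(<maintenance><patch letters>)", e.g. "4.2(7f)".
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_release(release: str) -> Tuple[int, int, int, str]:
    """Split a release string into a tuple that compares in release order."""
    match = _RELEASE_RE.match(release.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    major, minor, maintenance, patch = match.groups()
    return int(major), int(minor), int(maintenance), patch


def is_at_least(release: str, baseline: str) -> bool:
    """Return True if `release` is the same as or later than `baseline`."""
    return parse_release(release) >= parse_release(baseline)


if __name__ == "__main__":
    # Is the fabric release covered by guidance written for 4.2(7f) or later?
    for candidate in ("5.1(3e)", "4.2(4o)"):
        print(candidate, is_at_least(candidate, "4.2(7f)"))
```

A check like this is only a convenience for inventory scripts; the release notes remain the authority on which releases support a given feature.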
VMware ESXi hosts with VMware vSphere 7.0 can be integrated with Cisco ACI release 4.2(4o) or later using VMM.

VMware ESXi hosts can integrate with Cisco ACI either using the VMware vSphere Distributed Switch (vDS) or using the Cisco Application Virtual Switch (AVS) and Cisco ACI Virtual Edge. Between Cisco ACI 4.2 and Cisco ACI 5.1, there have been some changes with regard to the integration options with VMware ESXi hosts. Starting with Cisco ACI release 5.0(1), AVS is no longer supported. Starting with Cisco ACI 5.1(1), Cisco APIC can integrate with VMware NSX-T as a VMM domain.

Note: This design guide explains design considerations related to teaming with specific reference to the VMM integration with VMware vSphere. It does not cover the integration with Cisco ACI Virtual Edge or with VMware NSX-T.

For information about the support for virtualization products with Cisco ACI, see the ACI Virtualization Compatibility Matrix: trix.html

For more information about integrating virtualization products with Cisco ACI, see the virtualization documentation on the following html#Virtualization — Configuration Guides

Cisco ACI building blocks

Cisco Nexus 9000 series hardware

This section clarifies the naming conventions used for the leaf and spine switches referred to in this document:
● N9K-C93xx refers to the Cisco ACI leaf switches
● N9K-C95xx refers to the Cisco modular chassis
● N9K-X97xx refers to the Cisco ACI spine switch line cards

The trailing -E, -X, -F, and -G signify the following:
● -E: Enhanced. This refers to the ability of the switch to classify traffic into endpoint groups (EPGs) based on the source IP address of the incoming traffic.
● -X: Analytics. This refers to the ability of the hardware to support analytics functions. The hardware that supports analytics includes other enhancements in the policy CAM, in the buffering capabilities, and in the ability to classify traffic to EPGs.
● -F: Support for MAC security.
● -G: Support for 400 Gigabit Ethernet.

For simplicity, this document refers to any switch without a suffix or with the -X suffix as a first generation switch, and any switch with -EX, -FX, -GX, or any later suffix as a second generation switch.
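The first generation and second generation shorthand can be sketched as a small Python helper. This only mirrors the simplification stated above and is not a hardware lookup. The list of second generation suffix prefixes is an assumption that would need extending for later suffixes, and the function name and the placeholder model "N9K-C93xx" are hypothetical.

```python
# Suffix prefixes this document treats as second generation.
# Assumption: any later suffix families would have to be added here by hand.
SECOND_GEN_PREFIXES = ("EX", "FX", "GX")


def switch_generation(model: str) -> int:
    """Return 1 or 2 according to the document's naming shorthand."""
    # Skip the family token ("N9K") and the platform token ("C93...").
    suffix_tokens = model.upper().split("-")[2:]
    if any(token.startswith(SECOND_GEN_PREFIXES) for token in suffix_tokens):
        return 2
    # No suffix, or only the -X suffix: first generation under this shorthand.
    return 1


if __name__ == "__main__":
    for name in ("N9K-C93108TC-FX-24", "N9K-C9348GC-FXP", "N9K-C93xx-X", "N9K-C93xx"):
        print(name, "-> generation", switch_generation(name))
```

Matching on a prefix rather than an exact suffix lets variants such as -FX2 or -FXP fall into the same group as -FX.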
Note: The Cisco ACI leaf switches with names ending in -GX have hardware that is capable of operating as either a spine or leaf switch. The software support for either option comes in different releases. For more information, see the following heet-c78741560.html

For port speeds, the naming conventions are as follows:
● G: 100M/1G
● P: 1/10-Gbps Enhanced Small Form-Factor Pluggable (SFP+)
● T: 100-Mbps, 1-Gbps, and 10GBASE-T copper
● Y: 10/25-Gbps SFP+
● Q: 40-Gbps Quad SFP+ (QSFP+)
● L: 50-Gbps QSFP28
● C: 100-Gbps QSFP28
● D: 400-Gbps QSFP-DD

You can find the updated taxonomy on the following s/datacenter/nexus9000/hw/n9k taxonomy.html

For more information about Cisco Nexus 400 Gigabit Ethernet switch hardware (which includes Cisco ACI leaf and spine switches), go to the following link: tml# products

Leaf switches

In Cisco ACI, all workloads connect to leaf switches. The leaf switches used in a Cisco ACI fabric are Top-of-the-Rack (ToR) switches. Several leaf switch models are available, and they differ in the following respects:
● Port speed and medium type
● Buffering and queue management: All leaf switches in Cisco ACI provide advanced capabilities to load balance traffic more precisely, including dynamic packet prioritization, to prioritize short-lived, latency-sensitive flows (sometimes referred to as mouse flows) over long-lived, bandwidth-intensive flows (also called elephant flows). The newest hardware also introduces more sophisticated ways to keep track of and measure elephant and mouse flows and prioritize them, as well as more efficient ways to handle buffers.
● Policy CAM size and handling: The policy CAM is the hardware resource that allows filtering of traffic between EPGs. It is a TCAM resource in which Access Control Lists (ACLs) are expressed in terms of which EPG (security zone) can talk to which EPG (security zone). The policy CAM size varies depending on the hardware. The way in which the policy CAM handles Layer 4 operations and bidirectional contracts also varies depending on the hardware. -FX and -GX leaf switches offer more capacity compared with -EX and -FX2.
● Multicast routing support in the overlay: A Cisco ACI fabric can perform multicast routing for tenant traffic (multicast routing in the overlay).
● Support for analytics: The newest leaf switches and spine switch line cards provide flow measurement capabilities for the purposes of analytics and application dependency mappings.
● Support for link-level encryption: The newest leaf switches and spine switch line cards provide line-rate MAC security (MACsec) encryption.
● Scale for endpoints: One of the major features of Cisco ACI is the endpoint database, which maintains the information about which endpoint is mapped to which Virtual Extensible LAN (VXLAN) tunnel endpoint (VTEP), in which bridge domain, and so on.
● Fibre Channel (FC) and Fibre Channel over Ethernet (FCoE): Depending on the leaf model, you can attach FC and/or FCoE-capable endpoints and use the leaf switch as an FCoE NPV device.
● Support for Layer 4 to Layer 7 service redirect: The Layer 4 to Layer 7 service graph is a feature that has been available since the first release of Cisco ACI, and it works on all leaf switches. The Layer 4 to Layer 7 service graph redirect option allows redirection of traffic to Layer 4 to Layer 7 devices based on protocols.
● Microsegmentation, or EPG classification capabilities: Microsegmentation refers to the capability to isolate traffic within an EPG (a function similar or equivalent to the private VLAN function) and to segment traffic based on virtual machine properties, IP address, MAC address, and so on.
● Ability to change the allocation of hardware resources, such as to support more Longest Prefix Match entries, or more policy CAM entries, or more IPv4 entries. This concept is called "tile profiles," and it was introduced in Cisco ACI 3.0. For more information, see the following tches/datacenter/aci/apic/sw/kb/b Cisco APIC Forwarding Scale Profile Policy.pdf.
You may also want to read the Verified Scalability Guide: .html#Verified Scalability Guides.

For more information about the differences between the Cisco Nexus 9000 series switches, see the following documents: 38259.html

Spine switches

The spine switches are available in several form factors, both modular and fixed. Cisco ACI leaf switches with names ending in -GX have hardware that can operate both as a spine and as a leaf switch. At the time of this writing, some -GX leaf switches can only be installed with the Cisco ACI leaf switch software and some can only be installed with the spine switch software.

The differences among spine switches with different hardware are as follows:
● Port speeds
● Support for analytics: Although this capability is primarily a leaf switch function and it may not be necessary in the spine switch, in the future there may be features that use this capability in the spine switch.
● Support for link-level encryption and for tches/datacenter/aci/aci multi- ion-Guide-201 chapter 011.html#id 79312.
● Support for Cisco ACI Multi-Pod and Cisco ACI Multi-Site: For more information, refer to the specific documentation on Cisco ACI Multi-Pod and Cisco ACI Multi-Site, including the respective release notes.

For information about Cisco ACI Multi-Site hardware requirements, see the following tches/datacenter/aci/aci lti-Site-Hardware-Requirements-Guide-201.html

For more information about the differences between the Cisco Nexus 9500 platform module line cards, refer to the following -c78732088.html

The Cisco ACI fabric forwards traffic based on host lookups (when doing routing): all known endpoints in the fabric are programmed in the spine switches. The endpoints saved in the leaf switch forwarding table are only those that are used by the leaf switch in question, thus preserving hardware resources at the leaf switch. As a consequence, the overall scale of the fabric can be much higher than the individual scale of a single leaf switch.

The spine switch models also differ in the number of endpoints that can be stored in the spine proxy table, which depends on the type and number of fabric modules installed.

You should use the verified scalability limits for the latest Cisco ACI release and see how many endpoints can be used per e.html#Verified Scalability Guides

According to the verified scalability limits, the following spine switch configurations have the indicated endpoint scalabilities:
● Max. 450,000 Proxy Database Entries with four (4) fabric line cards
● Max. 180,000 Proxy Database Entries with the fixed spine switches

The above numbers represent the sum of the number of MAC, IPv4, and IPv6 addresses. For instance, in the case of a Cisco ACI fabric with fixed spine switches, this translates into:
● 180,000 MAC-only EPs (each EP with one MAC only)
● 90,000 IPv4 EPs (each EP with one MAC and one IPv4)
● 60,000 dual-stack EPs (each EP with one MAC, one IPv4, and one IPv6)

The number of supported endpoints is a combination of the capacity of the hardware tables, what the software allows you to configure, and what has been tested.

See the Verified Scalability Guide for a given release and the Capacity Dashboard in the Cisco APIC GUI for this information.
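The arithmetic behind the endpoint figures above can be sketched in a few lines of Python: each MAC, IPv4, or IPv6 address consumes one proxy database entry, so the endpoint count is the entry budget divided by the number of addresses per endpoint. The function name and parameters are illustrative assumptions, and the budgets are the figures quoted above rather than values read from a live fabric.

```python
def max_endpoints(proxy_entries: int, macs: int = 1, ipv4: int = 0, ipv6: int = 0) -> int:
    """Endpoints that fit in a proxy database when every address uses one entry."""
    entries_per_endpoint = macs + ipv4 + ipv6
    if entries_per_endpoint <= 0:
        raise ValueError("an endpoint must have at least one address")
    return proxy_entries // entries_per_endpoint


if __name__ == "__main__":
    fixed_spine_budget = 180_000  # proxy database entries quoted for fixed spine switches
    print("MAC only:  ", max_endpoints(fixed_spine_budget))
    print("MAC + IPv4:", max_endpoints(fixed_spine_budget, ipv4=1))
    print("Dual stack:", max_endpoints(fixed_spine_budget, ipv4=1, ipv6=1))
```

The same function applied to the 450,000-entry budget gives the corresponding figures for a modular spine switch with four fabric line cards. Endpoints with several IP addresses per MAC consume proportionally more entries.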
Cabling

Detailed guidelines about which type of transceivers and cables you should use are outside the scope of this document. The Transceiver Compatibility Matrix is a useful tool for this task: https://tmgmatrix.cisco.com/

Cisco Application Policy Infrastructure Controller (APIC)

The Cisco APIC is the point of configuration for policies and the place where statistics are archived and processed to provide visibility, telemetry, and application health information and enable overall management of the fabric. The controller is a physical appliance based on a Cisco UCS rack server with two interfaces for connectivity to the leaf switches. The Cisco APIC is also equipped with Gigabit Ethernet interfaces for out-of-band management.

For more information about the Cisco APIC models, see the following 715.html

Note: A cluster may contain a mix of different Cisco APIC models; however, the scalability will be that of the least powerful cluster member. The naming of the Cisco APICs, such as M3 or L3, is independent of the UCS series names.

Fabric with mixed hardware or software

Fabric with different spine switch types

In Cisco ACI, you can mix new and old generations of hardware for the spine and leaf switches. For instance, you could have first-generation hardware leaf switches and new-generation hardware spine switches, or vice versa.
The main considerations with spine hardware are as follows:
● Uplink bandwidth between leaf and spine switches
● Scalability of the spine proxy table (which depends primarily on the type of fabric line card that is used in the spine)
● Cisco ACI Multi-Site requires spine switches based on the Cisco Nexus 9500 platform cloud-scale line cards to connect to the intersite network

You can mix spine switches of different types, but the total number of endpoints that the fabric supports is the minimum common denominator.

Fabric with different leaf switch types

When mixing leaf switches of different hardware types in the same fabric, you may have varying support of features and different levels of scalability.

In Cisco ACI, the processing intelligence resides primarily on the leaf switches, so the choice of leaf switch hardware determines which features may be used (for example, multicast routing in the overlay, or FCoE). Not all leaf switches provide the same hardware capabilities to implement all features.

As an example, classification features such as IP address-based EPG, copy service, service-based redirect, FCoE, and potentially microsegmentation (depending on whether or not you use a software switch that supports the OpFlex protocol) or Layer 3 multicast are not equally available on all leaf switches.
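The "minimum common denominator" rule stated earlier for fabrics with mixed spine switch types amounts to taking the smallest proxy table among the spine switches. A minimal Python sketch follows. The spine names are hypothetical, and the two capacities reuse the proxy database figures quoted earlier in this document purely as sample inputs.

```python
from typing import Dict


def fabric_endpoint_limit(spine_proxy_entries: Dict[str, int]) -> int:
    """Proxy database entries the fabric supports: the smallest spine table wins."""
    if not spine_proxy_entries:
        raise ValueError("at least one spine switch is required")
    return min(spine_proxy_entries.values())


if __name__ == "__main__":
    # Hypothetical fabric mixing a modular spine and a fixed spine.
    spines = {"spine-101": 450_000, "spine-102": 180_000}
    print(fabric_endpoint_limit(spines))
```

Every spine switch has to hold all known endpoints, which is why adding a larger spine switch next to a smaller one does not raise the limit.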
Cisco APIC pushes the managed object to the leaf switches regardless of the ASIC that is present. If a leaf switch does not support a given feature, it raises a fault. For multicast routing, you should ensure that the bridge domains and Virtual Routing and Forwarding (VRF) instances configured with the feature are deployed only on the leaf switches that support the feature.

Fabric with different software versions

The Cisco ACI fabric is designed to operate with the same software version on all the APICs and switches. During upgrades, there may be different versions of the OS running in the same fabric.

If the leaf switches are running different software versions, the following behavior applies: Cisco APIC pushes features based on what is implemented in its software version. If a leaf switch is running an older version of software and does not understand a feature, the leaf switch will reject the feature; however, the leaf switch may not raise a fault.

For more information about which configurations are allowed with a mixed OS version in the fabric, refer to the following tml#Software and Firmware Installation and Upgrade Guides

Running a Cisco ACI fabric with different software versions is meant to be just a temporary configuration to facilitate upgrades, and minimal or no configuration changes should be performed while the fabric runs with mixed OS versions.

Fabric extenders (FEX)

You can connect fabric extenders (FEXes) to the Cisco ACI leaf switches; the main purpose of doing so should be to simplify migration from an existing network with fabric extenders.
If the main requirement for the use of FEX is the Fast Ethernet port speeds, you may also want to consider the Cisco ACI leaf switch models Cisco Nexus N9K-C9348GC-FXP, N9K-C93108TC-FX, N9K-C93108TC-FX-24, N9K-C93108TC-EX, N9K-C93108TC-EX-24, and N9K-C93216TC-FX2.

A FEX can be connected to Cisco ACI with what is known as a straight-through topology, and vPCs can be configured between hosts and the FEX, but not between the FEX and Cisco ACI leaf switches.

A FEX can be connected to leaf switch front-panel ports as well as converted downlinks (since Cisco ACI release 3.1).

A FEX has many limitations compared to attaching servers and network devices directly to a leaf switch. The main limitations are as follows:
● No support for L3Out on a FEX
● No rate limiter support on a FEX
● No Traffic Storm Control on a FEX
● No Port Security support on a FEX
● A FEX should not be used to connect routers or Layer 4 to Layer 7 devices with service graph redirect
● The use in conjunction with microsegmentation works, but if microsegmentation is used, then Quality of Service (QoS) does not work on FEX ports because all microsegmented traffic is tagged with a specific class of service. Microsegmentation with a FEX is a combination that, at the time of this writing, has not been extensively validated.

Support for FCoE on a FEX was added in Cisco ACI release /datacenter/aci/apic/sw/1-x/release/notes/apic rn 221.html

When using Cisco ACI with a FEX, you should check the verified scalability limits; in particular, the limits related to the number of ports multiplied by the number of VLANs configured on the ports (commonly referred to as P,V pairs): html#Verified Scalability Guides

With regard to scalability, you should keep in mind the following points:
● The total scale for VRFs, bridge domains (BDs), endpoints, and so on is the same whether you are using FEX attached to a leaf switch or connecting endpoints directly to a leaf switch. This means that, when using FEX, the amount of hardware resources that the leaf switch provides is divided among more ports than just the leaf switch ports.
● The total number of VLANs that can be used on each FEX port is limited by the maximum number of P,V pairs that are available per leaf switch for host-facing ports on FEX. As of this writing, this number is 10,000 per leaf switch, which means that, with 100 FEX ports, you can have a maximum of 100 VLANs configured on each FEX port.
● At the time of this writing, the maximum number of encapsulations per FEX port is 20, which means that the maximum number of EPGs per FEX port is 20.

The maximum number of FEX per leaf switch is 20.
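The P,V pair budget described above is a simple product of ports and VLANs, and a short Python sketch can make the trade-off explicit. It assumes VLANs are spread evenly across FEX ports and uses the 10,000 P,V pairs per leaf switch quoted above. The function names are illustrative, and the sketch deliberately models only the P,V budget, not the other limits listed above.

```python
PV_PAIRS_PER_LEAF = 10_000  # figure quoted in this document for host-facing FEX ports


def pv_pairs_used(fex_ports: int, vlans_per_port: int) -> int:
    """P,V pairs consumed when every FEX port carries the same number of VLANs."""
    return fex_ports * vlans_per_port


def max_vlans_per_fex_port(fex_ports: int, pv_budget: int = PV_PAIRS_PER_LEAF) -> int:
    """Largest uniform VLAN count per FEX port that stays within the budget."""
    if fex_ports <= 0:
        raise ValueError("fex_ports must be positive")
    return pv_budget // fex_ports


def fits_pv_budget(fex_ports: int, vlans_per_port: int, pv_budget: int = PV_PAIRS_PER_LEAF) -> bool:
    """True if the planned port and VLAN counts fit within the P,V budget."""
    return pv_pairs_used(fex_ports, vlans_per_port) <= pv_budget


if __name__ == "__main__":
    print(max_vlans_per_fex_port(100))   # the worked example from the text
    print(fits_pv_budget(200, 60))       # a denser, hypothetical design
```

Doubling the number of FEX ports halves the VLANs available per port, so port density and VLAN density have to be planned together.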
For more information about which leaf switch is compatible with which fabric extender, refer to the y/fexmatrix/fextables.html

For more information about how to connect a fabric extender to Cisco ACI, see the following c-Extender-with-Applica.html

Physical topology

As of release 4.1, a Cisco ACI fabric can be built as a two-tier fabric or as a multi-tier (three-tier) fabric.

Prior to Cisco ACI 4.1, the Cisco ACI fabric allowed only the use of a two-tier (spine and leaf switch) topology, in which each leaf switch is connected to every spine switch in the network with no interconnection between leaf switches or spine switches.

Starting from Cisco ACI 4.1, the Cisco ACI fabric also allows the use of two tiers of leaf switches, which provides the capability for vertical expansion of the Cisco ACI fabric. This is useful for migrating a traditional three-tier core-aggregation-access architecture, which has been a common design model for many enterprise networks and is still required today. The primary reason for this is cable reach, where many hosts are located across floors or across buildings; however, due to the high pricing of fiber cables and the limitations of cable distances, it is not ideal in some situations to build a full-mesh two-tier fabric. In those cases, it is more efficient for customers to build a spine-leaf-leaf topology and continue to benefit from the automation and visibility of Cisco ACI.

Figure 2 Cisco ACI two-tier and Multi-tier topology

Leaf and spine switch functions

The Cisco ACI fabric is based on a two-tier (spine and leaf switch) or three-tier (spine switch, tier-1 leaf switch, and tier-2 leaf switch) architecture in which the leaf and spine switches provide the following functions:
● Leaf switches: These devices have ports connected to classic Ethernet devices, such as servers, firewalls, and router ports. Leaf switches are at the edge of the fabric and provide the VXLAN Tunnel Endpoint (VTEP) function. In Cisco ACI terminology, the IP address that represents the leaf VTEP is called the Physical Tunnel Endpoint (PTEP). The leaf switches are responsible for routing or bridging tenant packets and for applying network policies.
● Spine switches: These devices interconnect leaf switches. They can also be used to build a Cisco ACI Multi-Pod fabric by connecting a Cisco ACI pod to an IP network, or they can connect to a supported WAN device (see more information in the "Designing external layer 3 connectivity" section). Spine switches also store all the endpoint-to-VTEP mapping entries (spine switch proxies).

Within a pod, all tier-1 leaf switches connect to all spine switches, and all spine switches connect to all tier-1 leaf switches, but no direct connectivity is allowed between spine switches, between tier-1 leaf switches, or between tier-2 leaf switches. If you incorrectly cable spine switches to each other or leaf switches in the same tier to each other, the interfaces will be disabled. You may have topologies in which certain leaf switches are not connected to all spine switches (such as in stretched fabric designs), but traffic forwarding may be suboptimal in this scenario.

Leaf switch fabric links

Up until Cisco ACI 3.1, fabric ports on leaf switches were hard-coded as fabric (iVXLAN) ports and could connect only to spine switches.
Starting with Cisco ACI 3.1, you can change the default configuration and make ports that would normally be fabric links into downlinks, or vice versa. More information can be found at the following b ACIFundamentals/b ACI-Fundamentals chapter 010011.html#id 60593

For information about the optics supported by Cisco ACI leaf and spine switches, use the following tool: https://tmgmatrix.cisco.com/home

Multi-tier design considerations

Only Cisco Cloudscale switches are supported for multi-tier spine and leaf switches.
● Spine: EX/FX/C/GX spine switches (Cisco Nexus 9332C, 9364C, and 9500 with EX/FX/GX line cards)
● Tier-1 leaf: EX/FX/FX2/GX except Cisco Nexus 93180LC-EX
● Tier-2 leaf: EX/FX/FX2/GX

Design considerations for multi-tier topology include the following:
● All switch-to-switch links must be configured as fabric ports. For example, tier-2 leaf switch fabric ports are connected to tier-1 leaf switch fabric ports.
● A tier-2 leaf switch can connect to more than two tier-1 leaf switches, in comparison to a traditional double-sided vPC design, which has only two upstream switches. The maximum number of ECMP links supported by a tier-2 leaf switch to tier-1 leaf switch is 18.
● An EPG, L3Out, Cisco APIC, or FEX can be connected to tier-1 leaf switches or to tier-2 leaf switches.
● Tier-1 leaf switches can have both hosts and tier-2 leaf switches connected to them.
● Changing from a tier-1 to a tier-2 leaf switch and back requires decommissioning and recommissioning the switch.
● Multi-tier architectures are compatible with Cisco ACI Multi-Pod and Cisco ACI Multi-Site.
● Tier-2 leaf switches cannot be connected to remote leaf switches (tier-1 leaf switches).
● Scale: The maximum number of tier-1 leaf switches and tier-2 leaf switches combined is equal to the maximum number of leaf switches in the fabric (200 per pod; 500 per Cisco ACI Multi-Pod as of Cisco ACI release 5.1).

For more information about Cisco ACI multi-tier, see the following astructure/whitepaper-c11-742214.html

Per leaf switch RBAC (Role Based Access Control)

Up until Cisco ACI 5.0, a Cisco ACI fabric administrator could assign a tenant to a security domain to let users have read/write privilege for a specific tenant assigned to that security domain, but that RBAC feature was not applicable to specific leaf switches.

Starting from Cisco ACI 5.0, a leaf switch can be assigned to a security domain so that only specific users can configure leaf switches assigned to that security domain and users in other security domains have no a