It’s been a long time since I worked at a service provider, where MPLS L3 VPNs were used to transport isolated customer networks. The way I remember it operating back then was by using multiple MP-BGP sessions between PE routers to share VPNv4 routes for different customers. This provides tenant isolation; however, the number of sessions (and usually the operational effort) grows as the number of customers grows. Also, each time a new PoP with a PE router is brought up, the number of places to maintain this ever-growing amount of config increases.
A modern solution to this config and session sprawl seems to be VXLAN BGP EVPN. EVPN is probably best known for its Type 2 (MAC/IP Advertisement) routes, which allow an L2 domain (MAC addresses) to be stretched over an L3 fabric. However, EVPN itself is just a BGP address family, a control plane, used to share TEP information to facilitate host reachability. In the case of NSX, VXLAN BGP EVPN can be used to share reachability to IP hosts (VMs) with the physical network using Type 5 (IP Prefix) routes. This provides a way to maintain tenant isolation without having to add a new BGP session for every customer, and potentially offers a simpler set of configurations all round.
The topology that will be built up in this post:
NOTE: This assumes NSX-T is already deployed and BGP is up between the physical network and the T0.
First, create a VNI Pool. Each VNI in the pool will be used as a transit VXLAN VNI for each VRF that will be created:
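As an alternative to clicking through the UI, the pool can also be created via the NSX Policy API. A minimal sketch of building the request body is below; the resource type, endpoint path (`PATCH /policy/api/v1/infra/vni-pools/<pool-id>`) and field names are assumptions based on the NSX-T 3.x Policy API and may differ between versions, so check the API guide for your release:

```python
# Sketch: building the body for an NSX-T Policy API call to create a VNI pool.
# Resource type and field names are assumptions (NSX-T 3.x era) -- verify
# against your version's API reference before using.
import json

def vni_pool_payload(display_name: str, start: int, end: int) -> dict:
    """Build the request body for PATCH /policy/api/v1/infra/vni-pools/<pool-id>."""
    return {
        "resource_type": "VniPoolConfig",
        "display_name": display_name,
        "start": start,  # first transit VNI handed out to a VRF
        "end": end,      # last transit VNI in the pool
    }

payload = vni_pool_payload("EVPN-VNI-POOL", 1338001, 1338100)
print(json.dumps(payload, indent=2))
```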
Under the T0 EVPN Settings, set the EVPN Mode to Inline. This uses the Edge Transport Nodes in both the EVPN control plane and the VXLAN data plane. The alternative (Route Server) mode still uses the Edge Nodes in the control plane, but uses the Host Transport Nodes directly in the data plane, with additional config overhead not covered here. Also select the VNI Pool previously created:
Then create an EVPN TEP for each Edge Node. Note that these will be used in the VXLAN data plane, so they should be reachable in the physical fabric; they will be advertised out in the next step. GENEVE will still be used for normal East-West NSX data plane traffic, but these new TEPs will run VXLAN for the North-South traffic.
Under the BGP Route Re-Distribution menu, select EVPN TEP IP to advertise the VTEPs:
The Edge TEPs are now being advertised into the physical network and the physical TEPs should be reachable from the Edges. This could be achieved with static routes on the Edges, via an IGP like OSPF or by redistributing them into the IPv4 eBGP domain as seen here:
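In text form, the redistribution piece on the physical switch might look something like the following sketch. This is Cisco NX-OS-style syntax, and the ASN, prefix and route-map names are hypothetical placeholders, not taken from this lab:

```
! Advertise the switch's local TEP-facing networks over the existing
! IPv4 eBGP peering to the T0, so each side can reach the other's VTEPs
ip prefix-list PL-TEPS seq 5 permit 10.10.99.0/24 le 32

route-map RM-TEPS permit 10
  match ip address prefix-list PL-TEPS

router bgp 65000
  address-family ipv4 unicast
    redistribute direct route-map RM-TEPS
```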
Now to enable the MP-BGP EVPN address family for the T0 BGP neighbours (the IPs of your TOR switches):
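The matching change is needed on the physical side. As a rough NX-OS-style sketch (ASNs and neighbour IP are hypothetical; on NX-OS the `nv overlay evpn` feature also needs enabling first):

```
nv overlay evpn                    ! enable the EVPN control plane feature

router bgp 65000
  address-family l2vpn evpn        ! enable the EVPN AFI/SAFI globally
  neighbor 192.168.10.2
    remote-as 65001
    address-family l2vpn evpn
      send-community extended      ! route targets ride in extended communities
```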
Usually in an NSX-T deployment the uplink VLANs from the Edges to the physical network can remain at 1500 bytes, because they’re not used for carrying Overlay (GENEVE) traffic. However, as they’ll now be carrying VXLAN, the MTU will need to be raised to at least 1600 bytes to accommodate the encapsulation overhead.
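For context on the number: VXLAN adds roughly 50 bytes on top of the original frame (14-byte outer Ethernet + 20-byte outer IP + 8-byte UDP + 8-byte VXLAN header), so a 1500-byte payload needs at least ~1550; 1600 is the conventional round figure with headroom. On the switch side this is just an interface MTU change, sketched here in NX-OS-style syntax with a hypothetical interface:

```
interface Ethernet1/49
  description Uplink towards NSX Edge
  mtu 1600
```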
Next, create each VRF and link them to the underlying T0. The RD can be auto generated (based on the T0 RD) but may be easier to identify if set manually. Each VRF should use a unique VNI from the pool previously created. This VNI should match that being used on the physical network. Here RED is using VNI 1338001 with an auto RD and BLUE is using VNI 1338002 with a manual RD:
As part of the VRF, set a Route Target import and export policy to ensure routes for the VRF, both NSX Segments and external physical networks, are included in the MP-BGP EVPN address-family:
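For the routes to be imported on the physical side, the switch needs a matching VRF with the same VNI and route targets. A rough NX-OS-style sketch for RED is below; the ASN and RT values are hypothetical and must simply agree with whatever import/export policy was set in NSX:

```
vrf context RED
  vni 1338001                          ! must match the VNI the VRF uses in NSX
  rd auto
  address-family ipv4 unicast
    route-target both 65001:1338001        ! RT for the VPN routes
    route-target both 65001:1338001 evpn   ! same RT applied to the EVPN routes

interface nve1
  member vni 1338001 associate-vrf     ! L3VNI bound to the switch's VTEP interface
```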
Apply route re-distribution to each VRF like you would with a normal T0 e.g. advertising all T0 and T1 subnets:
Create a T1 Gateway for each VRF, link them to their respective T0 VRF and again advertise routes as required:
Create a Segment for each of the VRFs and attach to their respective T1 Gateways, note that these subnets could now overlap as they are separate VRFs:
We can now see the Segments advertised into the physical network as EVPN Type 5 routes. Note here the BLUE-WEB Segment with the manual RD is listed and the next-hops are the VXLAN TEP IPs of the NSX Edge Nodes:
And each is installed into its respective VRF table:
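On an NX-OS-style switch, the sort of commands used for this verification would be along these lines (output omitted; VRF names as per this lab):

```
show bgp l2vpn evpn                  ! EVPN table: RDs, routes, VTEP next-hops
show bgp l2vpn evpn route-type 5     ! filter to just the IP Prefix routes
show ip route vrf RED                ! prefixes installed in the RED VRF table
show ip route vrf BLUE               ! and in the BLUE VRF table
```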
You can also view the RED and BLUE routing tables on the NSX Edges to see the externally learned routes, which are also VRF-specific, extending that multi-tenancy separation:
Attach a VM to one of the Segments and test ping from the VRF in the physical world:
Running a capture from the PE router, while pinging from the RED VM to the PE router shows the ICMP is inside a VXLAN packet, with the RED VNI of 1338001:
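If you wanted to pull the VNI out of a capture by hand rather than relying on the dissector, the VXLAN header is easy to decode: it’s 8 bytes after the UDP header (port 4789), with a VNI-valid flag in the first byte and the 24-bit VNI in bytes 4–6 (RFC 7348). A small stdlib-only sketch (the header bytes here are hand-built for illustration, not taken from the actual capture):

```python
# Sketch: building and decoding a VXLAN header (RFC 7348) to show where the
# VNI sits. Sample data is hand-built for illustration.
import struct

VXLAN_PORT = 4789  # IANA-assigned VXLAN UDP destination port

def build_vxlan_header(vni: int) -> bytes:
    """Build a minimal 8-byte VXLAN header carrying the given VNI."""
    # Byte 0: flags (0x08 = VNI valid), bytes 1-3 reserved,
    # bytes 4-6: 24-bit VNI, byte 7 reserved.
    return struct.pack("!B3x", 0x08) + vni.to_bytes(3, "big") + b"\x00"

def vxlan_vni(vxlan_header: bytes) -> int:
    """Extract the 24-bit VNI from an 8-byte VXLAN header."""
    if not vxlan_header[0] & 0x08:
        raise ValueError("VNI-valid flag not set")
    return int.from_bytes(vxlan_header[4:7], "big")

hdr = build_vxlan_header(1338001)   # the RED VRF's transit VNI
print(vxlan_vni(hdr))               # -> 1338001
```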
Note there is inherently no connectivity between VRFs. To enable comms between the two you could implement traditional route leaking on the physical network, or use static routes on the T0 so the inter-VRF traffic never has to leave the NSX domain.
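The route-leaking option on the physical side is the classic cross-VRF route-target import. Continuing the hypothetical RT values from the earlier sketch, it might look something like this in NX-OS-style syntax:

```
vrf context RED
  address-family ipv4 unicast
    route-target import 65001:1338002 evpn   ! pull BLUE's EVPN routes into RED

vrf context BLUE
  address-family ipv4 unicast
    route-target import 65001:1338001 evpn   ! pull RED's EVPN routes into BLUE
```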