Don’t Forget Overlay MTU Requirements!

There are several benefits to overlay networking, such as providing a larger number of segments to consume, for example over 16m for VXLAN and Geneve (both provide 24-bits for the VNI) or removing the reliance on physical network config to spin up a new subnet.

To make use of these benefits and more, the overlay doesn’t ask for much of the underlay, just a few basic things:

  • IP connectvity between tunnel endpoints
  • Any firewalls in the path allows UDP6081 for Geneve or UDP4789 for VXLAN
  • Jumbo frames with an MTU size of at least 1600 bytes
  • Optionally, multicast if you wish to optimise flooding

This article is about what happens when you don’t obey rule #3 above with NSX-T…

Topology

A basic 3-tier app has been deployed as above, with each tier on it’s own segment and connected to a T1 gateway. Routing between segments will be completely distributed in the kernel of each hypervisor that the workloads are on.

Problem

When connected to one of the WEB servers and attempting to access the APP, the connection was getting reset:

As the WEB and APP workloads were on separate hosts the traffic was being encapsulated into a Geneve packet (which cannot be fragmented) and sent over the transport network from TEP to TEP:

A ping test confirmed that connectivity was ok up to 1414 bytes, anything larger was being dropped:

root@NWATCH-WEB01 [ ~ ]# ping 10.250.70.1 -s 1414

PING 10.250.70.1 (10.250.70.1) 1414(1442) bytes of data.

1422 bytes from 10.250.70.1: icmp_seq=1 ttl=61 time=1.33 ms

^C

— 10.250.70.1 ping statistics —

1 packets transmitted, 1 received, 0% packet loss, time 0ms

rtt min/avg/max/mdev = 1.326/1.326/1.326/0.000 ms

root@NWATCH-WEB01 [ ~ ]# ping 10.250.70.1 -s 1415

PING 10.250.70.1 (10.250.70.1) 1415(1443) bytes of data.

^C

— 10.250.70.1 ping statistics —

1 packets transmitted, 0 received, 100% packet loss, time 0ms

To prove the app layer was ok, a small page returning just one word was tested and worked fine:

Solved

As soon as the MTU size was increased on the underlay transport network to 1600 bytes the full webpage loaded fine:

And a ping for good measure:

root@NWATCH-WEB01 [ ~ ]# ping 10.250.70.1 -s 1415

PING 10.250.70.1 (10.250.70.1) 1415(1443) bytes of data.

1423 bytes from 10.250.70.1: icmp_seq=1 ttl=61 time=1.76 ms

1423 bytes from 10.250.70.1: icmp_seq=2 ttl=61 time=1.69 ms

^C

— 10.250.70.1 ping statistics —

2 packets transmitted, 2 received, 0% packet loss, time 2ms

rtt min/avg/max/mdev = 1.686/1.723/1.760/0.037 ms

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s