Jumbo Frames


Overview

Jumbo frames raise the Ethernet MTU so that large packets can be sent without fragmentation; this is a common need on a private GigE switched network carrying Oracle RAC Interconnect traffic between nodes. The accepted standard MTU is 9000 bytes (large enough for an 8 KB payload plus protocol overhead). Once your GigE switches have been configured for the new MTU, configure and test your servers.
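
Before and after the change, the live MTU of the bond and its slaves can be checked with ip link (bond1 and eth1 here match the example setup below):

 # ip link show bond1
 # ip link show eth1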


Configuration

Example Setup:

  • Node A: 192.168.100.101 (bond1, eth1 + eth5)
  • Node B: 192.168.100.102 (bond1, eth1 + eth5)


RHEL/CentOS

On both nodes:

# ip link set dev bond1 mtu 9000
# vi /etc/sysconfig/network-scripts/ifcfg-bond1
   add: MTU=9000
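
For reference, a finished ifcfg-bond1 on Node A might look like the following; the BONDING_OPTS shown are illustrative and should match whatever the bond already uses:

 DEVICE=bond1
 BOOTPROTO=none
 ONBOOT=yes
 IPADDR=192.168.100.101
 NETMASK=255.255.255.0
 BONDING_OPTS="mode=active-backup miimon=100"
 MTU=9000

The bonding driver propagates the bond's MTU to its slaves, so eth1 and eth5 normally do not need their own MTU= entries.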


Basic Testing

Node A to B

# ifenslave -c bond1 eth5
# ip route get 192.168.100.102
# tracepath -n 192.168.100.102
# ping -c 5 -s 8972 -M do 192.168.100.102

# ifenslave -c bond1 eth1
# ip route get 192.168.100.102
# tracepath -n 192.168.100.102
# ping -c 5 -s 8972 -M do 192.168.100.102
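
The 8972-byte payload is deliberate: add the 8-byte ICMP header and the 20-byte IP header and you get exactly 9000 bytes, while -M do sets the Don't Fragment bit so the ping fails outright, rather than silently fragmenting, if any hop's MTU is below 9000. Quick sanity check of the arithmetic:

 # echo $((9000 - 20 - 8))
 8972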

Node B to A

# ifenslave -c bond1 eth5
# ip route get 192.168.100.101
# tracepath -n 192.168.100.101
# ping -c 5 -s 8972 -M do 192.168.100.101

# ifenslave -c bond1 eth1
# ip route get 192.168.100.101
# tracepath -n 192.168.100.101
# ping -c 5 -s 8972 -M do 192.168.100.101
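
Assuming an active-backup bond, each ifenslave -c call above forces the traffic onto a different physical path; which slave is actually active can be confirmed from the bonding driver's status file:

 # grep 'Currently Active Slave' /proc/net/bonding/bond1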


Performance Testing

Use iperf (available via EPEL) for throughput measurements.
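
If iperf is not yet installed, it is a single yum install once EPEL is enabled:

 # yum install iperf

About the flags used below: -u selects UDP, -l 8972 sizes each datagram so it fills one 9000-byte frame (8972 + 8 UDP + 20 IP), -w 768k enlarges the socket buffers, -b 10G requests more bandwidth than GigE can carry so the wire is the bottleneck, and -B binds each end to the bond's interconnect address.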

Node A to B

Node B: (receiver)
 # ifenslave -c bond1 eth5
 # iperf -B 192.168.100.102 -s -u -l 8972 -w 768k

Node A: (sender)
 # ifenslave -c bond1 eth5
 # iperf -B 192.168.100.101 -c 192.168.100.102 -u \
    -b 10G -l 8972 -w 768k -i 2 -t 30


Node B: (receiver)
 # ifenslave -c bond1 eth1
 # iperf -B 192.168.100.102 -s -u -l 8972 -w 768k

Node A: (sender)
 # ifenslave -c bond1 eth1
 # iperf -B 192.168.100.101 -c 192.168.100.102 -u \
    -b 10G -l 8972 -w 768k -i 2 -t 30

Node B to A

Node A: (receiver)
 # ifenslave -c bond1 eth5
 # iperf -B 192.168.100.101 -s -u -l 8972 -w 768k

Node B: (sender)
 # ifenslave -c bond1 eth5
 # iperf -B 192.168.100.102 -c 192.168.100.101 -u \
    -b 10G -l 8972 -w 768k -i 2 -t 30


Node A: (receiver)
 # ifenslave -c bond1 eth1
 # iperf -B 192.168.100.101 -s -u -l 8972 -w 768k

Node B: (sender)
 # ifenslave -c bond1 eth1
 # iperf -B 192.168.100.102 -c 192.168.100.101 -u \
    -b 10G -l 8972 -w 768k -i 2 -t 30
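
After each run, check the interface counters for drops or errors that UDP will not surface on its own, and compare the receiver's reported rate against the theoretical maximums in the next section:

 # ip -s link show bond1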


Protocol Overhead

Reference: Theoretical Maximums and Protocol Overhead:

Theoretical maximum TCP throughput on GigE using jumbo frames:

  (9000-20-20-12)/(9000+14+4+7+1+12)*1000000000/1000000 = 990.042 Mbps
    |   |  |  |     |   |  | | | |       |         |
   MTU  |  |  |    MTU  |  | | | |      GigE      Mbps
        |  |  |         |  | | | |
       IP  |  |  Ethernet  | | | |      InterFrame Gap (IFG), aka
    Header |  |    Header  | | | |      InterPacket Gap (IPG), is
           |  |            | | | |      a minimum of 96 bit times
         TCP  |          FCS | | |      from the last bit of the
      Header  |              | | |      FCS to the first bit of
              |       Preamble | |      the preamble
            TCP                | |
        Options            Start |
    (Timestamp)            Frame |
                       Delimiter |
                           (SFD) |
                                 |
                             Inter
                             Frame
                               Gap
                             (IFG)

Theoretical maximum UDP throughput on GigE using jumbo frames:
  (9000-20-8)/(9000+14+4+7+1+12)*1000000000/1000000 = 992.697 Mbps

Theoretical maximum TCP throughput on GigE without using jumbo frames:
  (1500-20-20-12)/(1500+14+4+7+1+12)*1000000000/1000000 = 941.482 Mbps

Theoretical maximum UDP throughput on GigE without using jumbo frames:
  (1500-20-8)/(1500+14+4+7+1+12)*1000000000/1000000 = 957.087 Mbps
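
Each of these figures can be re-derived at the shell with bc; note that *1000000000/1000000 reduces to *1000. For the jumbo-frame TCP case:

 # echo 'scale=6; (9000-20-20-12)/(9000+14+4+7+1+12)*1000' | bc
 990.042000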

Ethernet frame format:
  * 6 byte dest addr
  * 6 byte src addr
  * [4 byte optional 802.1q VLAN Tag]
  * 2 byte length/type
  * 46-1500 byte data (payload)
  * 4 byte CRC 

Ethernet overhead bytes:
  12 gap + 8 preamble + 14 header + 4 trailer = 38 bytes/packet w/o 802.1q
  12 gap + 8 preamble + 18 header + 4 trailer = 42 bytes/packet with 802.1q

Ethernet Payload data rates are thus:
  1500/(38+1500) = 97.5293 %   w/o 802.1q tags
  1500/(42+1500) = 97.2763 %   with 802.1q tags

TCP over Ethernet:
 Assuming no header compression (e.g. not PPP)
 Add 20 IPv4 header or 40 IPv6 header (no options)
 Add 20 TCP header
 Add 12 bytes optional TCP timestamps
 Max TCP Payload data rates over ethernet are thus:
  (1500-40)/(38+1500) = 94.9285 %  IPv4, minimal headers
  (1500-52)/(38+1500) = 94.1482 %  IPv4, TCP timestamps
  (1500-52)/(42+1500) = 93.9040 %  802.1q, IPv4, TCP timestamps
  (1500-60)/(38+1500) = 93.6281 %  IPv6, minimal headers
  (1500-72)/(38+1500) = 92.8479 %  IPv6, TCP timestamps
  (1500-72)/(42+1500) = 92.6070 %  802.1q, IPv6, TCP timestamps

UDP over Ethernet:
 Add 20 IPv4 header or 40 IPv6 header (no options)
 Add 8 UDP header
 Max UDP Payload data rates over ethernet are thus:
  (1500-28)/(38+1500) = 95.7087 %  IPv4
  (1500-28)/(42+1500) = 95.4604 %  802.1q, IPv4
  (1500-48)/(38+1500) = 94.4083 %  IPv6
  (1500-48)/(42+1500) = 94.1634 %  802.1q, IPv6