NSX Load Balancing

The following overview of load balancing is taken from the work of Max Ardica and Nimish Desai in the official NSX Design Guide:

Overview

Load balancing is another network service available within NSX that can be natively enabled on the NSX Edge device. The two main drivers for deploying a load balancer are scaling out an application (by distributing the workload across multiple servers) and improving its high-availability characteristics.

NSX Load Balancing

The NSX load balancing service is specially designed for cloud with the following characteristics:

  • Fully programmable via API
  • Same single central point of management/monitoring as other NSX network services

The load balancing services natively offered by the NSX Edge satisfy the needs of the majority of application deployments, because the NSX Edge provides a large set of functions:

  • Support for any TCP application, including, but not limited to, LDAP, FTP, HTTP, and HTTPS
  • Support for UDP applications starting from NSX software release 6.1
  • Multiple load balancing distribution algorithms: round-robin, least connections, source IP hash, URI
  • Multiple health checks: TCP, HTTP, HTTPS, including content inspection
  • Persistence: source IP, MSRDP, cookie, SSL session ID
  • Connection throttling: max connections and connections/sec
  • L7 manipulation, including, but not limited to, URL block, URL rewrite, content rewrite
  • Optimization through support of SSL offload
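
To make three of the distribution algorithms listed above concrete, here is a minimal Python sketch. It illustrates the general techniques only and is not how the NSX Edge implements them; the function names and the use of CRC32 as the hash are assumptions made for this example.

```python
import itertools
import zlib

def make_round_robin(servers):
    """Return a picker that hands out servers in a fixed rotating order."""
    rotation = itertools.cycle(servers)
    return lambda: next(rotation)

def least_connections(active):
    """Pick the server with the fewest active connections.

    `active` maps server address -> current connection count.
    """
    return min(active, key=active.get)

def source_ip_hash(client_ip, servers):
    """Map a client IP to a server so the same client always lands on the same one.

    CRC32 is an arbitrary choice for this sketch.
    """
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]

web_tier = ["172.16.10.11", "172.16.10.12"]   # web servers from the lab below
pick = make_round_robin(web_tier)
print([pick() for _ in range(4)])
print(least_connections({"172.16.10.11": 5, "172.16.10.12": 2}))
print(source_ip_hash("192.168.110.10", web_tier))
```

Round-robin ignores server load, least connections reacts to it, and source IP hash trades even distribution for stickiness without needing a persistence table.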

Note: the NSX platform can also integrate load-balancing services offered by third-party vendors. This integration is out of scope for this paper.

In terms of deployment, the NSX Edge offers support for two types of models:

  • One-arm mode (called proxy mode): this scenario is shown in the figure below and consists of deploying an NSX Edge directly connected to the logical network for which it provides load-balancing services.
One-Arm Mode Load Balancing Services

The one-armed load balancer functionality is shown above:

  1. The external client sends traffic to the Virtual IP address (VIP) exposed by the load balancer.
  2. The load balancer performs two address translations on the original packets received from the client: Destination NAT (D-NAT) to replace the VIP with the IP address of one of the servers deployed in the server farm, and Source NAT (S-NAT) to replace the client IP address with the IP address identifying the load balancer itself. S-NAT is required to force the return traffic from the server farm to the client back through the LB.
  3. The server in the server farm replies by sending the traffic to the LB (because of the S-NAT function previously discussed).

  4. The LB again performs Source and Destination NAT to send the traffic to the external client, using its VIP as the source IP address.
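
The one-arm packet walk described above can be traced with a small Python sketch. The VIP and server address come from the lab later in this post and the client address from the session table shown in the troubleshooting section; `LB_IP` is a hypothetical interface address chosen for illustration. A real load balancer finds the client through its connection table, whereas here the client IP is simply passed in.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str

VIP = "172.16.10.10"       # virtual IP exposed by the load balancer
LB_IP = "172.16.10.5"      # hypothetical address identifying the load balancer
SERVER = "172.16.10.11"    # web-sv-01a

def to_server(pkt):
    """Step 2: D-NAT (VIP -> server) plus S-NAT (client -> load balancer)."""
    if pkt.dst != VIP:
        raise ValueError("packet is not addressed to the VIP")
    return Packet(src=LB_IP, dst=SERVER)

def to_client(reply, client_ip):
    """Step 4: undo both translations; the VIP becomes the source."""
    if reply.dst != LB_IP:
        raise ValueError("reply did not come back to the load balancer")
    return Packet(src=VIP, dst=client_ip)

client = "192.168.110.10"
request = Packet(src=client, dst=VIP)                  # step 1
forwarded = to_server(request)                         # step 2
reply = Packet(src=forwarded.dst, dst=forwarded.src)   # step 3
returned = to_client(reply, client)                    # step 4

for name, p in [("request", request), ("forwarded", forwarded),
                ("reply", reply), ("returned", returned)]:
    print(f"{name:10} {p.src} -> {p.dst}")
```

Note that the server only ever sees `LB_IP` as the source, which is why it loses visibility into the original client address.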

The advantage of this model is that it is simpler to deploy and flexible, as it allows deploying LB services (NSX Edge appliances) directly on the logical segments where they are needed without requiring any modification on the centralized NSX Edge providing routing communication to the physical network. On the downside, this option requires provisioning more NSX Edge instances and mandates the use of Source NAT, which prevents the servers in the DC from seeing the original client IP address.

Note: the LB can insert the original IP address of the client into the HTTP header before performing S-NAT (a function named “Insert X-Forwarded-For HTTP header”). This provides the servers visibility into the client IP address but it is obviously limited to HTTP traffic.
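
As a rough illustration of what inserting this header means, the Python sketch below adds the client IP to a dictionary of request headers. The function name is made up for this example, and appending to an existing header value follows common proxy practice; whether the NSX Edge behaves exactly this way is an assumption.

```python
def insert_x_forwarded_for(headers, client_ip):
    """Return a copy of the request headers with the client IP recorded."""
    updated = dict(headers)
    existing = updated.get("X-Forwarded-For")
    # Append to an existing chain so earlier proxies stay visible (assumed behavior).
    updated["X-Forwarded-For"] = f"{existing}, {client_ip}" if existing else client_ip
    return updated

request_headers = {"Host": "172.16.10.10"}
print(insert_x_forwarded_for(request_headers, "192.168.110.10"))
```

The web server can then log or act on the header value instead of the S-NAT source address.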

  • Inline mode (called transparent mode): this mode instead requires deploying the NSX Edge inline with the traffic destined for the server farm. The way this works is shown in the figure below.

Two-Arms Mode Load Balancing Services

    1. The external client sends traffic to the Virtual IP address (VIP) exposed by the load balancer.
    2. The load balancer (centralized NSX Edge) performs only Destination NAT (D-NAT) to replace the VIP with the IP address of one of the servers deployed in the server farm.
    3. The server in the server farm replies to the original client IP address and the traffic is received again by the LB since it is deployed inline (and usually as the default gateway for the server farm).
    4. The LB performs Source NAT to send traffic to the external client leveraging its VIP as source IP address.
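
For comparison with one-arm mode, the inline steps above can be sketched in Python as follows. The addresses are illustrative values borrowed from the lab in this post, and the function names are assumptions. Only the destination is rewritten on the way in, so the server sees the real client address.

```python
from typing import NamedTuple

class Packet(NamedTuple):
    src: str
    dst: str

VIP = "172.16.10.10"        # illustrative VIP
SERVER = "172.16.10.11"     # illustrative pool member
CLIENT = "192.168.110.10"   # illustrative external client

def dnat_to_server(pkt):
    """Step 2: replace the VIP with the chosen server; the source is untouched."""
    return pkt._replace(dst=SERVER)

def snat_to_client(reply):
    """Step 4: replace the server address with the VIP on the way out."""
    return reply._replace(src=VIP)

seen_by_server = dnat_to_server(Packet(src=CLIENT, dst=VIP))
# Step 3: the server answers the real client; the inline Edge intercepts the reply.
reply = Packet(src=seen_by_server.dst, dst=seen_by_server.src)
returned = snat_to_client(reply)

print("server sees :", seen_by_server)
print("client sees :", returned)
```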

    This deployment model is also quite simple and allows the servers to have full visibility into the original client IP address. At the same time, it is less flexible from a design perspective as it usually forces using the LB as default gateway for the logical segments where the server farms are deployed and this implies that only centralized (and not distributed) routing must be adopted for those segments. It is also important to notice that in this case LB is another logical service added to the NSX Edge already providing routing services between the logical and the physical networks. As a consequence, it is recommended to increase the form factor of the NSX Edge to X-Large before enabling load-balancing services.

     

    In terms of scalability and throughput figures, the NSX load balancing services offered by each single NSX Edge can scale up to (best case scenario):

    • Throughput: 9 Gbps
    • Concurrent connections: 1 million
    • New connections per sec: 131k

     

    Below are some deployment examples of tenants with different applications and different load balancing needs. Notice how each of these applications is hosted on the same cloud with the network services offered by NSX.

Deployment Examples of NSX Load Balancing

Two final important points to highlight:

  • The load balancing service can be fully distributed across tenants. This brings multiple benefits:
  • Each tenant has its own load balancer.
  • Each tenant configuration change does not impact other tenants.
  • A load increase on one tenant's load balancer does not impact the scale of other tenants' load balancers.
  • Each tenant load balancing service can scale up to the limits mentioned above.

Other network services are still fully available

  • The same tenant can mix its load balancing service with other network services such as routing, firewalling, VPN.

 

One Arm Load Balance Lab Topology

In this One Arm Load Balance Lab Topology we have a three-tier application built from:

Web servers: web-sv-01a (172.16.10.11), web-sv-02a (172.16.10.12)

App: app-sv-01a (172.16.20.11)

DB: db-sv-01a (172.16.30.11)

We will add an NSX Edge Services Gateway (ESG) to this lab to provide the load balancer function.

The ESG (highlighted with the red line) is deployed in one-arm mode and exposes the VIP 172.16.10.10 to load-balance traffic to the Web-Tier-01 segment.

One-Armed Lab topology

 

Configure One Arm Load Balance

Create NSX Edge gateway:

One-Arem-1

Select Edge Service Gateway (ESG):
One-Arem-2

Set the admin password, and enable SSH and auto rule generation:

One-Arem-3

Install the ESG in Management Cluster:

One-Arem-4

In our lab the appliance size is Compact, but you should choose the size according to the amount of traffic expected:

One-Arem-5

Configure the Edge interface and IP address; since this is one-arm mode we have only one interface:

One-Arem-6

Configure the default gateway:

One-Arem-8

Set the default firewall rule to accept:

One-Arem-9

Complete the installation:

One-Arem-10

Verify that the ESG is deployed:

One-Arem-11

Enable load balancing on the ESG: go to the Load Balancer tab and click Edit:

One-Arem-12

Select the “Enable Load Balancer” check box:

One-Arem-13

Create the application profile:

One-Arem-14

Add a name, select HTTPS as the Type, and select Enable SSL Passthrough:

One-Arem-15

Create the pool:

One-Arem-16

Select ROUND-ROBIN as the Algorithm, keep the default HTTPS monitor, and add the two web servers as pool members:

One-Arem-16h

To add members, click the + icon; the port we monitor is 443:

One-Arem-17

We then need to create the VIP:

One-Arem-18

In this step we tie the configuration together: we bind the application profile to the pool and assign the virtual IP address:

One-Arem-19

Now we can check that the load balancer is actually working by connecting to the VIP address with a client web browser.

In the web browser, we point to the VIP address 172.16.10.10.

The result is that we hit web-sv-01a (172.16.10.11):

One-Arem-verification-1

When we refresh the web browser, we hit web-sv-02a (172.16.10.12):

One-Arem-verification-2

Troubleshooting One Arm Load Balance

General load balancer troubleshooting workflow

Review the configuration through UI

Check the pool member status through UI

Do online troubleshooting via CLI:

  • Check LB engine status (L4/L7)
  • Check LB objects statistics (vips, pools, members)
  • Check Service Monitor status (OK, WARNING, CRITICAL)
  • Check system log message (# show log)
  • Check LB L4/L7 session table
  • Check LB L7 sticky-table status

 

Check the configuration through UI

 

 

One-Arem-TSHOT-1

 

Check the pool member status through the UI:

 

One-Arem-TSHOT-2

Possible errors discovered:

  1. Port 80/443 might be in use by other services (e.g. sslvpn).
  2. The member port and monitor port are misconfigured, so the health check fails.
  3. A member in WARNING state should be treated as DOWN.
  4. The L4 LB engine is used when:
    a) the protocol is TCP/HTTP;
    b) there are no persistence settings and no L7 settings;
    c) accelerateEnable is true.
  5. The pool is in transparent mode but the Edge doesn’t sit in the return path.
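
The conditions in item 4 can be read as a single predicate. The Python sketch below encodes them literally; the function and parameter names are invented for this example and do not correspond to any NSX API.

```python
def uses_l4_engine(protocol, has_persistence, has_l7_settings, accelerate_enabled):
    """Return True only when every condition listed in item 4 holds."""
    return bool(
        protocol.upper() in {"TCP", "HTTP"}   # a) TCP/HTTP protocol
        and not has_persistence               # b) no persistence settings ...
        and not has_l7_settings               #    ... and no L7 settings
        and accelerate_enabled                # c) accelerateEnable is true
    )

print(uses_l4_engine("TCP", False, False, True))    # all conditions met
print(uses_l4_engine("HTTP", True, False, True))    # persistence configured
```

If any one condition fails, the predicate is false, which is a quick way to reason about why a virtual server reports L7 when L4 was expected.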

Do online troubleshooting via CLI:

Check LB engine status (L4/L7)

# show service loadbalancer

Check LB objects statistics (vips, pools, members)

# show service loadbalancer virtual [vip-name]

# show service loadbalancer pool [pool-name]

Check Service Monitor status (OK, WARNING, CRITICAL)

# show service loadbalancer monitor

Check system log message

# show log

Check LB session table

# show service loadbalancer session

Check LB L7 sticky-table status

# show service loadbalancer table

 

 

One-Arm-LB-0> show service loadbalancer
<cr>
error Show loadbalancer Latest Errors information.
monitor Show loadbalancer HealthMonitor information.
pool Show loadbalancer pool information.
session Show loadbalancer Session information.
table Show loadbalancer Sticky-Table information.
virtual Show loadbalancer virtualserver information.

#########################################################

One-Arm-LB-0> show service loadbalancer
———————————————————————–
Loadbalancer Services Status:

L7 Loadbalancer : running
Health Monitor : running

#########################################################

One-Arm-LB-0> show service loadbalancer monitor
———————————————————————–
Loadbalancer HealthMonitor Statistics:

POOL                               MEMBER                                  HEALTH STATUS
Web-Servers-Pool-01  web-sv-02a_172.16.10.12   default_https_monitor:OK
Web-Servers-Pool-01  web-sv-01a_172.16.10.11   default_https_monitor:OK
One-Arm-LB-0>
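
When the monitor output is captured to a text file, a short script can flag unhealthy members. The Python sketch below parses the sample output above; the three-column layout is inferred from that sample only, and treating anything other than OK as unusable follows the earlier advice that WARNING should be treated as DOWN.

```python
SAMPLE = """\
POOL                               MEMBER                                  HEALTH STATUS
Web-Servers-Pool-01  web-sv-02a_172.16.10.12   default_https_monitor:OK
Web-Servers-Pool-01  web-sv-01a_172.16.10.11   default_https_monitor:OK
"""

def parse_monitor(text):
    """Turn 'show service loadbalancer monitor' rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three columns, the last being "monitor:STATE".
        if len(parts) != 3 or ":" not in parts[2]:
            continue
        monitor, state = parts[2].rsplit(":", 1)
        rows.append({
            "pool": parts[0],
            "member": parts[1],
            "monitor": monitor,
            "state": state,
            "usable": state == "OK",   # WARNING and CRITICAL both count as down
        })
    return rows

for row in parse_monitor(SAMPLE):
    print(row["member"], row["state"], "usable" if row["usable"] else "NOT usable")
```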

##########################################################

One-Arm-LB-0> show service loadbalancer virtual
———————————————————————–
Loadbalancer VirtualServer Statistics:

VIRTUAL Web-Servers-VIP
| ADDRESS [172.16.10.10]:443
| SESSION (cur, max, total) = (0, 3, 35)
| RATE (cur, max, limit) = (0, 6, 0)
| BYTES in = (17483), out = (73029)
+->POOL Web-Servers-Pool-01
| LB METHOD round-robin
| LB PROTOCOL L7
| Transparent disabled
| SESSION (cur, max, total) = (0, 3, 35)
| BYTES in = (17483), out = (73029)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 2, 8)
| | BYTES in = (8882), out = (43709)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 1, 7)
| | BYTES in = (7233), out = (29320)

####################################################################
One-Arm-LB-0> show service loadbalancer pool
———————————————————————–
Loadbalancer Pool Statistics:

POOL Web-Servers-Pool-01
| LB METHOD round-robin
| LB PROTOCOL L7
| Transparent disabled
| SESSION (cur, max, total) = (0, 3, 35)
| BYTES in = (17483), out = (73029)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 2, 8)
| | BYTES in = (8882), out = (43709)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 1, 7)
| | BYTES in = (7233), out = (29320)

##########################################################################

One-Arm-LB-0> show service loadbalancer session
———————————————————————–
L7 Loadbalancer Current Sessions:

0x5fe50a2b230: proto=tcpv4 src=192.168.110.10:49392 fe=Web-Servers-VIP be=Web-Servers-Pool-01 srv=web-sv-01a_172.16.10.11 ts=08 age=8s calls=3 rq[f=808202h,i=0,an=00h,rx=4m53s,wx=,ax=] rp[f=008202h,i=0,an=00h,rx=4m53s,wx=,ax=] s0=[7,8h,fd=13,ex=] s1=[7,8h,fd=14,ex=] exp=4m52s
0x5fe50a22960: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=09 age=0s calls=2 rq[f=c08200h,i=0,an=00h,rx=20s,wx=,ax=] rp[f=008002h,i=0,an=00h,rx=,wx=,ax=] s0=[7,8h,fd=1,ex=] s1=[7,0h,fd=-1,ex=] exp=20s
———————————————————————–
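
The session lines are dense, so it can help to pull out just the fields that identify the flow. The Python sketch below does that for the first session above (abridged); the choice of fields and the function name are assumptions for this example, and the bracketed `rq[...]`, `rp[...]`, `s0` and `s1` groups are deliberately skipped.

```python
LINE = ("0x5fe50a2b230: proto=tcpv4 src=192.168.110.10:49392 fe=Web-Servers-VIP "
        "be=Web-Servers-Pool-01 srv=web-sv-01a_172.16.10.11 ts=08 age=8s calls=3 "
        "rq[f=808202h,i=0,an=00h,rx=4m53s,wx=,ax=] exp=4m52s")

FIELDS = ("proto", "src", "fe", "be", "srv", "age")

def parse_session(line):
    """Extract selected top-level key=value fields from one session line."""
    session = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        # Tokens such as "rq[f=..." yield keys outside FIELDS and are ignored.
        if sep and key in FIELDS:
            session[key] = value
    return session

for key, value in parse_session(LINE).items():
    print(f"{key:5} = {value}")
```

Here `fe` is the virtual server, `be` the pool, and `srv` the pool member that is serving the client in `src`.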

 

Disconnect web-sv-01a_172.16.10.11 from the network

 

 

One-Arem-TSHOT-3

From the GUI we can see the effect on the pool member status:

One-Arem-TSHOT-4

 

One-Arm-LB-0> show service loadbalancer virtual
———————————————————————–
Loadbalancer VirtualServer Statistics:

VIRTUAL Web-Servers-VIP
| ADDRESS [172.16.10.10]:443
| SESSION (cur, max, total) = (0, 3, 35)
| RATE (cur, max, limit) = (0, 6, 0)
| BYTES in = (17483), out = (73029)
+->POOL Web-Servers-Pool-01
| LB METHOD round-robin
| LB PROTOCOL L7
| Transparent disabled
| SESSION (cur, max, total) = (0, 3, 35)
| BYTES in = (17483), out = (73029)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: DOWN
| | STATUS = DOWN, MONITOR STATUS = default_https_monitor:CRITICAL
| | SESSION (cur, max, total) = (0, 2, 8)
| | BYTES in = (8882), out = (43709)
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
| | SESSION (cur, max, total) = (0, 1, 7)
| | BYTES in = (7233), out = (29320)
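
To spot failed members quickly in a long `show service loadbalancer virtual` capture, the member lines can be filtered with a regular expression. This Python sketch relies only on the line format visible in the output above; the function name is made up for this example.

```python
import re

SAMPLE = """\
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-01a_172.16.10.11, STATUS: DOWN
| | STATUS = DOWN, MONITOR STATUS = default_https_monitor:CRITICAL
+->POOL MEMBER: Web-Servers-Pool-01/web-sv-02a_172.16.10.12, STATUS: UP
| | STATUS = UP, MONITOR STATUS = default_https_monitor:OK
"""

# Captures pool name, member name, and the status reported on the member line.
MEMBER_RE = re.compile(r"POOL MEMBER: ([^/\s]+)/([^,\s]+), STATUS: (\w+)")

def down_members(text):
    """Return 'pool/member' for every member whose status is not UP."""
    return [f"{pool}/{member}"
            for pool, member, status in MEMBER_RE.findall(text)
            if status != "UP"]

print(down_members(SAMPLE))
```

With the sample above, only web-sv-01a is reported, matching the server that was disconnected from the network.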

6 comments on “NSX Load Balancing
  1. jatinyona says:

    Hi Roie, Quick question on NSX Load Balancing.
    Do you know if ESG can be used to load balance between a VM and a physical server and if it can be done will it be supported by VMware?
    Thanks in advance, Jatin

    • roie9876@gmail.com says:

      Hello jatinyona,
      If the Edge can communicate with the server (over VLAN or VXLAN), it can load balance to it. But you need to know how the traffic gets back to the Edge: does the Edge sit inline with the traffic? Does the Edge act as the default gateway for the physical server? Regarding support, you need to ask GSS or PM …

  2. Edwin Ma says:

    Hi ! Roie

    If customer just purchased vCenter standard version, there would be only standard vswitch. I think we still can deploy Load Balancer at Edge. But there would be no VXLAN, just VLAN . Right ?

    • UNIVIRT says:

      Hi Edwin. I think you mean vSphere Standard? The version of vCenter is not really relevant. But, then again, neither is the version of vSphere because all NSX licenses also include the license to enable distributed virtual switch. As a result, you can implement NSX on any version of vSphere from Standard up to Enterprise Plus. I’ve even heard it said that you could implement it on Essentials Plus.

      It is a commonly held misunderstanding that if you want to use NSX you need to have vSphere Enterprise Plus – not true.

  3. ankur goyal says:

    Do we need to dedicate a NSX edge for load balancer, which means that in one-arm mode traffic always hairpins through this node ? Or is it fully distributed where the load balancer function is provided in the local hypervisor ?

  4. manoj vp says:

    One-Arm Mode for TCP the Packet walk is as you have mentioned, However When I configure for UDP the Packet walk is not as mentioned below any Idea.

    TCP Packet Walk

    Client-Ip > VIP
    Edge-IP >Server-IP
    Server-IP > Edge-IP
    VIP > Client-IP

    UDP Packet Walk

    Client-IP > VIP
    Client-IP > server-IP

    Since it is syslog server there is no return traffic

    is it expected behavior if yes can you provide more details.
