NSX Home LAB Part 2


 NSX Controller


This post was updated on 18.10.14

NSX Controller Overview

Controller

The NSX control plane runs in the NSX Controller. In a vSphere-optimized environment with VDS, the Controller enables multicast-free VXLAN and control plane programming of elements such as the Distributed Logical Router (DLR).

In all cases the Controller is purely a control plane component and does not have any data plane traffic passing through it. The controller nodes are deployed in a cluster with an odd number of members in order to enable high availability and scale.

The Controller's roles in the NSX architecture are:

  • Enables the VXLAN control plane by distributing network information.
  • Controllers are clustered in odd numbers (e.g., 3) for scale-out and high availability.
  • A TCP (SSL) server implements the control plane protocol.
  • An extensible framework that supports multiple applications; currently VXLAN and the Distributed Logical Router.
  • Provides a CLI interface for statistics and runtime states (see the example commands right after this list).
  • Clustering, data persistence/replication, and the REST API framework from NVP are leveraged by the controller.
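
For example, the following controller CLI commands (all of them are used later in this post) expose the cluster's runtime state:

show control-cluster status
show control-cluster startup-nodes
show control-cluster roles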

The next overview of the NSX Controller is taken from the great work of Max Ardica and Nimish Desai in the official NSX Design Guide:

The Controller cluster in the NSX platform is the control plane component responsible for managing the switching and routing modules in the hypervisors. The Controller cluster consists of controller nodes that manage specific logical switches. Using a Controller cluster to manage VXLAN-based logical switches eliminates the need for multicast support from the physical network infrastructure. Customers no longer have to provision multicast group IP addresses, and they also do not need to enable PIM routing or IGMP snooping features on physical switches or routers.

Additionally, the NSX Controller supports an ARP suppression mechanism that reduces the need to flood ARP broadcast requests across the L2 network domain where virtual machines are connected. The different VXLAN replication modes and the ARP suppression mechanism will be discussed in more detail in the "Logical Switching" section.
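
As a side note, the controller CLI lets you look at the per-VNI state it maintains for a logical switch (the VTEP, MAC and ARP tables). The VNI 5000 below is only a placeholder; use the segment ID your logical switch actually received:

show control-cluster logical-switches vtep-table 5000
show control-cluster logical-switches mac-table 5000
show control-cluster logical-switches arp-table 5000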

For resiliency and performance, production deployments must deploy a Controller Cluster with multiple nodes. The NSX Controller Cluster represents a scale-out distributed system, where each Controller Node is assigned a set of roles that define the type of tasks the node can implement.

In order to increase the scalability characteristics of the NSX architecture, a "slicing" mechanism is utilized to ensure that all the controller nodes can be active at any given time.

Controller Slicing

The figure above illustrates how the roles and responsibilities are fully distributed between the different cluster nodes. This means, for example, that different logical networks (or logical routers) may be managed by different Controller nodes: each node in the Controller cluster is identified by a unique IP address, and when an ESXi host establishes a control-plane connection with one member of the cluster, a full list of IP addresses for the other members is passed down to the host, so that it can establish communication channels with all the members of the Controller cluster. This allows the ESXi host to know at any given time which specific node is responsible for a given logical network.

In the case of a failure of a Controller node, the slices for a given role that were owned by the failed node are reassigned to the remaining members of the cluster. In order for this mechanism to be resilient and deterministic, one of the Controller nodes is elected as a "Master" for each role. The Master is responsible for allocating slices to individual Controller nodes and for determining when a node has failed, so that the slices can be reallocated to the other nodes using a specific algorithm. The Master also informs the ESXi hosts about the failure of the cluster node, so that they can update their internal information specifying which node owns the various logical network slices.

The election of the Master for each role requires a majority vote of all active and inactive nodes in the cluster. This is the main reason why a Controller Cluster must always be deployed leveraging an odd number of nodes.

Controller Nodes Majority

Figure above highlights the different majority number scenarios depending on the number of Controller Cluster nodes. It is evident how deploying 2 nodes (traditionally considered an example of a redundant system) would increase the scalability of the Controller Cluster (since at steady state two nodes would work in parallel) without providing any additional resiliency. This is because with 2 nodes, the majority number is 2 and that means that if one of the two nodes were to fail, or they lost communication with each other (dual-active scenario), neither of them would be able to keep functioning (accepting API calls, etc.). The same considerations apply to a deployment with 4 nodes that cannot provide more resiliency than a cluster with 3 elements (even if providing better performance).

Note: NSX currently (as of software release 6.1) supports only clusters with 3 nodes. The various examples above with different numbers of nodes were given just to illustrate how the majority vote mechanism works.
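
The math behind the figure is the usual quorum rule: a cluster of N nodes needs floor(N/2) + 1 members to hold a majority. For example:

  • 1 node: majority = 1, no failure tolerated
  • 2 nodes: majority = 2, no failure tolerated
  • 3 nodes: majority = 2, one node can fail
  • 4 nodes: majority = 3, still only one node can fail
  • 5 nodes: majority = 3, two nodes can fail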

NSX Controller nodes are deployed as virtual appliances from the NSX Manager UI. Each appliance is characterized by an IP address, used for all control-plane interactions, and by specific settings (4 vCPUs, 4GB of RAM) that cannot currently be modified. If you want to shrink these values for a home lab anyway, read my post:

Downsizing NSX Controller

In order to ensure the reliability of the Controller cluster, it is good practice to spread the deployment of the cluster nodes across separate ESXi hosts, so that the failure of a single host does not cause the loss of majority in the cluster. NSX does not currently provide any embedded capability to ensure this, so the recommendation is to leverage the native vSphere DRS anti-affinity rules to avoid deploying more than one controller node on the same ESXi server.

Here is an example of such a rule in my lab:

NSX Management Cluster and DRS Rules

Anti-Affinity

For more information on how to create a VM-to-VM anti-affinity rule, please refer to the following KB article:

http://pubs.vmware.com/vsphere-55/index.jsp#com.vmware.vsphere.resmgmt.doc/GUID-7297C302-378F-4AF2-9BD6-6EDB1E0A850A.html

Deploying NSX Controller

From the NSX Controller menu we click the green + button.

Deploying Controller

The Add Controller window pops up; we will place the controller in the Management Cluster.

Capture2

An IP Pool is needed to automatically allocate an IP address for each controller node.
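
For illustration only (the exact values depend on your lab network): in my lab the pool hands out addresses from the management subnet, for example a range of 192.168.78.135-192.168.78.137, which matches the controller IPs that appear later in this post.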

Capture3

After clicking OK, NSX Manager will deploy a controller node in the Management Cluster.

Capture4

We will need to wait until the node status changes from "Deploying" to "Normal".

Capture5

At this point we have one node in the NSX Controller cluster.

Capture7

If you have a problem deploying the controller, read my post:

Deploying NSX-V controller failed

We can now SSH to the controller and run some show commands:

show control-cluster status

nvp-controller # show control-cluster status
Type                 Status                                     Since
--------------------------------------------------------------------------------
Join status:         Join complete                              04/16 02:55:19
Majority status:     Connected to cluster majority              04/16 02:55:11
Restart status:      This controller can be safely restarted    04/16 02:55:17
Cluster ID:          14d6067f-c1d2-4541-ae45-d20d1c47009f
Node UUID:           14d6067f-c1d2-4541-ae45-d20d1c47009f

Role                 Configured status    Active status
--------------------------------------------------------------------------------
api_provider         enabled              activated
persistence_server   enabled              activated
switch_manager       enabled              activated
logical_manager      enabled              activated
directory_server     enabled              activated

Join status shows when this node joined the cluster and the result of the join; in this output we get "Join complete". Majority status confirms that this node is connected to the cluster majority, and Restart status tells us the controller can be safely restarted.

Check which nodes are up and running, and their IP addresses:

show control-cluster startup-nodes

nvp-controller # show control-cluster startup-nodes
192.168.78.135

Controller Cluster and High Availability

Controller Cluster

For testing, we will install 3 nodes to join as one Controller cluster.

Install Controller Cluster

From the output of the startup-nodes command we can see that we have 3 nodes.

show control-cluster startup-nodes

nvp-controller # show control-cluster startup-nodes
192.168.78.135, 192.168.78.136, 192.168.78.137
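
As an extra check that the three nodes actually see each other, the controller CLI also has a connections view; this is only a rough pointer, as the exact output format differs between versions:

show control-cluster connections

Each clustering role/port should show the node either listening or holding open connections toward the other cluster members.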

One of the node members will be elected as Master.

In order to see which node member was elected as Master, we can run the command:

show control-cluster roles

Master Election

Node 1 was chosen as Master.
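
For readers who cannot make out the screenshot: the roles output is a small table with one row per role (api_provider, persistence_server, switch_manager, logical_manager and directory_server, the same roles we saw in the status output earlier) and, roughly, a per-role master flag; the node answering Yes for a role is the master for that role's slices. In the capture above that node is Node 1.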

Now let's check what will happen if we restart Node 1.

Restart Node 1

After a few seconds, Node 2 is elected as Master:

Node 2 Elected as Master

In order to save memory on my laptop, I will keep Node 1 and delete Node 2 and Node 3.

Want to know more about how to troubleshoot NSX Controllers?

Read my post:

Troubleshooting NSX-V Controller

Summary of Part 2

We installed the NSX Controller and saw the high-availability functionality of the NSX Controller cluster.

Lab topology:

Summary of home lab part 2

Related Posts:

Troubleshooting NSX-V Controller

NSX Controller

Host Preparation

Logical Switch

Distributed Logical Router

3 comments on "NSX Home LAB Part 2"
  1. NSX4me says:

    Hi!
    I can't figure out if you can have a dVS stretched over two clusters (compute and management) (last pic of this blog post). Thanks

    • roie9876@gmail.com says:

      Hello Oliver,
      The Transport Zone creates this "magic": even if we have two different vDS, one vDS for payload and the other vDS for Management, every time we create a new logical switch the transport zone will create a backing dvPortgroup on each vDS for the same logical switch.
