NSX Distributed Firewall Deep Dive

The following topics will be covered by this NSX DFW Deep dive:

NSX Distributed Firewall Overview:

NSX DFW is a distributed firewall spread across the ESXi hosts and enforced as close as possible to the source of the VM's traffic (shown in each VM). The DFW runs as a kernel service inside the ESXi host.

With the NSX DFW we can enforce a stateful firewall service for VMs, and the enforcement point is the VM virtual NIC (vNIC). Every packet that leaves the VM (before VTEP encapsulation) or enters the VM (after VTEP decapsulation) can be inspected against a firewall policy.


The DFW runs inside the ESXi host as a kernel-space module, which results in high throughput.

What makes the DFW such a strong feature is that its throughput capacity grows as we add more ESXi hosts to a vSphere cluster.

The DFW rules can be based on Layer 2 up to Layer 4, and with third-party vendor integration NSX can implement security features up to and including L7.

  • L2 rules are based on MAC addresses and L2 protocols such as ARP, RARP and LLDP.
  • L3 rules are based on source and destination IP addresses, and L4 rules use a TCP or UDP service port.

The policy is created at a centralized point, the vCenter Server, using the vSphere Web Client. The objects used in the rules are taken from the vCenter inventory.

How NSX Distributed Firewall works:

This section is taken from the NSX Design Guide:

The DFW instance on an ESXi host (1 instance per VM vNIC) contains 2 separate tables:

Rule table: used to store all policy rules.

Connection tracker table: caches flow entries for rules with a permit action.

Note: a specific flow is identified by the 5-tuple information Source IP address/Destination IP address/protocol/L4 source port/L4 destination port. Notice that by default the DFW does not perform a lookup on the L4 source port, but it can be configured to do so by defining a specific policy rule.

Before exploring the use case for these 2 tables, let’s first understand how DFW rules are enforced:

DFW rules are enforced in top-to-bottom order. Each packet is checked against the top rule in the rule table before moving down to the subsequent rules. The first rule in the table that matches the traffic parameters is enforced. Because of this behavior, when writing DFW rules it is always recommended to put the most granular policies at the top of the rule table. This is the best way to ensure they are enforced before any other rule.

The DFW default policy rule (the one at the bottom of the rule table) is a “catch-all” rule: packets that do not match any rule above it are handled by the default rule. After the host preparation operation, the DFW default rule is set to the ‘allow’ action. The main reason is that VMware does not want to break any VM-to-VM communication during staging or migration phases. However, it is a best practice to change the default rule to the ‘block’ action and enforce access controls through a positive control model (only traffic defined in the firewall policy is allowed onto the network).

Let’s now have a look at policy rule lookup and packet flow:

An IP packet (first packet – pkt1) that matches Rule number 2 is sent by the VM. The order of operations is the following:

A lookup is performed in the connection tracker table to check whether an entry for the flow already exists.

As Flow 3 is not present in the connection tracker table (i.e. a miss), a lookup is performed in the rule table to identify which rule applies to Flow 3. The first rule that matches the flow is enforced.

Rule 2 matches for Flow 3. Action is set to ‘Allow’.

Because action is set to ‘Allow’ for Flow 3, a new entry will be created inside the connection tracker table. The packet is then transmitted properly out of DFW.




DFW policy rule lookup and packet – subsequent packets.

Subsequent packets are processed in this order:

Lookup is performed in the connection tracker table to check if an entry for the flow already exists.

An entry for Flow 3 exists in the connection tracker table => Packet is transmitted properly out of DFW
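The two lookups described above can be sketched in a few lines of Python. This is only an illustrative model, not the vSIP implementation: the `Flow`, `Rule` and `DfwInstance` names, the rule fields and the sample addresses are assumptions made for the example, and the L4 source port is left out of the flow key because, as noted earlier, it is not looked up by default.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Flow:
    """Flow key (hypothetical model; L4 source port omitted on purpose)."""
    src_ip: str
    dst_ip: str
    protocol: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    """One rule-table entry; a field set to None means 'any'."""
    rule_id: int
    action: str  # "allow" or "block"
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    protocol: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, flow: Flow) -> bool:
        pairs = (
            (self.src_ip, flow.src_ip),
            (self.dst_ip, flow.dst_ip),
            (self.protocol, flow.protocol),
            (self.dst_port, flow.dst_port),
        )
        return all(want is None or want == got for want, got in pairs)


class DfwInstance:
    """Per-vNIC model holding a rule table and a connection tracker table."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rule_table = list(rules)  # evaluated top to bottom
        self.connection_tracker: Dict[Flow, int] = {}  # flow -> rule id

    def process(self, flow: Flow) -> Tuple[str, str]:
        # Step 1: connection tracker lookup.
        if flow in self.connection_tracker:
            return ("allow", "connection tracker hit")
        # Step 2: on a miss, the first matching rule wins.
        for rule in self.rule_table:
            if rule.matches(flow):
                if rule.action == "allow":
                    # Step 3: cache allowed flows for subsequent packets.
                    self.connection_tracker[flow] = rule.rule_id
                return (rule.action, f"rule {rule.rule_id}")
        # Not reached when the table ends with a catch-all default rule.
        return ("block", "no rule matched")


rules = [
    Rule(1, "block", dst_port=23),
    Rule(2, "allow", protocol="TCP", dst_port=443),
    Rule(3, "block"),  # catch-all default rule, here set to block
]
dfw = DfwInstance(rules)
flow3 = Flow("192.0.2.10", "192.0.2.20", "TCP", 443)
first = dfw.process(flow3)   # rule table lookup, tracker entry created
second = dfw.process(flow3)  # served from the connection tracker
```

Only allowed flows are cached, so a blocked flow is re-evaluated against the rule table on every packet in this model.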

One important aspect to emphasize is that DFW fully supports vMotion (automatic vMotion with DRS or manual vMotion). The rule table and the connection tracker table always follow the VM during vMotion operation. The positive result is there is no traffic disruption during workload moves and connections initiated before vMotion remain intact after the vMotion is completed. DFW brings VM movement freedom while ensuring continuous network traffic protection.

Note: this functionality does not depend on the Controllers or the NSX Manager being up and available.

NSX DFW brings a paradigm shift that was not possible before: security services are no longer dependent on the network topology. With DFW, security is completely decoupled from logical network topology.

In legacy environments, to provide security services to a server or set of servers, traffic from/to these servers must be redirected to a firewall using VLAN stitching method or L3 routing operations: traffic must go through this dedicated firewall in order to protect network traffic.

With NSX DFW, this is no longer needed because the firewall function is brought directly to the VM. Any traffic sent or received by this VM is systematically processed by the DFW. As a result, traffic protection between VMs (workload to workload) can be enforced whether the VMs are located on the same Logical Switch (or VDS VLAN-backed port-group) or on different Logical Switches.


NSX DFW architecture:

The vCenter, NSX Manager and ESXi host are functioning as the 3 main components in this architecture.

DFW Architecture


NSX Manager: The NSX manager provides the single point of configuration and the REST API entry-points in a vSphere environment for NSX. The consumption of NSX can be driven directly via the NSX manager UI. In a vSphere environment this is available via the vSphere Web UI itself. Typically end-users tie in the network virtualization to their cloud management platform for deploying applications.

vCenter: VMware vCenter Server provides a centralized platform for managing your VMware vSphere environments so you can automate and deliver a virtual infrastructure with confidence.

ESXi host: VMware ESXi is the hypervisor that runs the virtual machines' guest OS. DFW-related modules:

  1. vShield-Stateful-Firewall service daemon (vsfwd), which runs in user space.
  2. vSIP, which runs in kernel space.

vShield-Stateful-Firewall: a service daemon that runs constantly on the ESXi host and performs multiple tasks:

  1. Interact with NSX Manager to retrieve DFW policy rules.
  2. Gather DFW statistics information and send them to the NSX Manager.
  3. Send audit logs information to the NSX Manager.
  4. Receive configuration from NSX manager to create (or delete) DLR Control VM, create (or delete) ESG.
  5. Handle the SSL-related tasks from NSX Manager that are part of the host preparation process.


Message Bus Client: The NSX Manager communicates with the ESXi host using a secure protocol called AMQP.

“Advanced Message Queuing Protocol (AMQP) is an open standard application layer protocol for message-oriented middleware. The defining features of AMQP are message orientation, queuing, routing (including point-to-point and publish-and-subscribe), reliability and security.”

Source: http://en.wikipedia.org/wiki/Advanced_Message_Queuing_Protocol.

RabbitMQ is the AMQP implementation used by NSX.

The vShield-Stateful-Firewall daemon acts as a RabbitMQ client on the ESXi host. It is a user-space service daemon and uses a TCP/5671 connection to the RabbitMQ server in the NSX Manager. The message bus is used by the NSX Manager to send various information to the ESXi hosts:

Policy rules for the DFW module, controller nodes IP addresses, private key and host certificate to authenticate the communication between host and controller and requests to create/delete DLR instances.

vSIP: VMware Internetworking Service Insertion Platform. This is the core kernel-space module of the distributed firewall. vSIP receives firewall rules from NSX Manager (through the vShield-Stateful-Firewall daemon) and downloads them to each VM's VMware-sfw.
Note: VMware Internetworking Service-Insertion Platform is also a framework that provides the ability to dynamically introduce 3rd party and VMware’s own virtual as well as physical security and networking services into VMware virtual network.

VPXA: the vCenter agent, installed on the ESXi host when vCenter communicates with the host for the first time. Through VPXA, vCenter manages the ESXi host for vSphere-related tasks. Although it is not a direct part of the DFW architecture, VPXA is used to report the VM IP address learned by VMtools.


IOChains: VMware reserves IOChain slots to handle packet processing at the kernel level.

Slot 0: DVFilter (Distributed Virtual Filter):

The Distributed Virtual Filter (DVFilter) sits in the VMkernel, at slot 0, between the protected vNIC and its associated Distributed Virtual Switch (DVS) port. It is instantiated when a virtual machine with a protected virtual NIC is created. It monitors the incoming and outgoing traffic on the protected virtual NIC and performs stateless filtering.

Slot 1: sw-sec (Switch Security): the sw-sec module learns the VM's IP and MAC addresses. sw-sec is a critical component: it captures DHCP Ack and ARP broadcast messages and forwards this information as unicast to the NSX Controller to perform the ARP suppression feature. sw-sec is also the layer where NSX IP SpoofGuard is implemented.

Slot 2: VMware-sfw: this is where the DFW firewall rules are stored and enforced. VMware-sfw contains the rule table and the connection table.

NSX policy push process:

We create a security policy in the vCenter GUI, and this configuration is then stored in NSX Manager. When we push the policy, NSX Manager sends the firewall rules in protobuf format to all vSphere clusters.

Protocol Buffers are a method of serializing structured data. As such, they are useful in developing programs to communicate with each other over a wire or for storing data. The method involves an interface description language that describes the structure of some data and a program that generates from that description source code in various programming languages for generating or parsing a stream of bytes that represents the structured data.

Source: http://en.wikipedia.org/wiki/Protocol_Buffers

All ESXi hosts that are part of a vCenter cluster receive this policy through the vShield-Stateful-Firewall daemon over the message bus. At this point the daemon needs to parse the rules and convert them from protobuf messages into the VMkernel VSIPIOCTL format; vSIP then applies the rules on the slot 2 VMware-sfw of every VM.

Note: this process flow describes the case where the “Applied To” field in the security policy is “Distributed Firewall”, meaning the rule is applied everywhere.

A security administrator can create firewall rules built from vCenter objects such as:

Cluster, DC, VDS port-group, Logical Switch, IPSets, Resource Pool, vApp, VM, vNIC and Security Groups. However, the NSX firewall enforcement point at the VMware-sfw can only understand IP addresses or MAC addresses.

In the figure below we create a firewall rule to allow ICMP from source “Compute Cluster A” to destination “Compute Cluster B”:

Policy push process


The NSX Manager needs to work out which object IDs are represented by “Compute Cluster A” and “Compute Cluster B”, and then which IP addresses correspond to the VMs in those clusters.

The NSX firewall rules inside the ESXi host are created in VSIPIOCTL format and then applied on the VMware-sfw.

The NSX Manager relies on the vCenter internal database to get the object-ID/IP address mapping. We can view the data with the vCenter MOB (Managed Object Browser) using this URL: https://vCenter_IP/mob/

The NSX Manager keeps this information in its internal database.
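As a rough illustration of that translation step, the sketch below resolves a cluster object ID into the IP addresses that a rule would finally carry. The inventory dictionaries, the VM IDs other than vm-36, and the 192.0.2.x addresses are made-up sample data, and `resolve_object_to_ips` is a hypothetical helper, not an NSX or vCenter API.

```python
from typing import Dict, List, Optional

# Hypothetical inventory: cluster object ID -> VM object IDs in that cluster.
CLUSTER_MEMBERS: Dict[str, List[str]] = {
    "domain-c25": ["vm-36", "vm-37"],
    "domain-c26": ["vm-40"],
}

# Hypothetical mapping: VM object ID -> IP address known to vCenter
# (None models a VM whose IP address has not been reported).
VM_IPS: Dict[str, Optional[str]] = {
    "vm-36": "192.0.2.11",
    "vm-37": None,
    "vm-40": "192.0.2.21",
}


def resolve_object_to_ips(
    object_id: str,
    members: Dict[str, List[str]],
    vm_ips: Dict[str, Optional[str]],
) -> List[str]:
    """Expand an object ID into the IP addresses of its member VMs."""
    resolved = []
    for vm_id in members.get(object_id, []):
        ip = vm_ips.get(vm_id)
        if ip is not None:  # a VM without a known IP contributes nothing
            resolved.append(ip)
    return resolved


source_ips = resolve_object_to_ips("domain-c25", CLUSTER_MEMBERS, VM_IPS)
destination_ips = resolve_object_to_ips("domain-c26", CLUSTER_MEMBERS, VM_IPS)
```

In this model a VM with no known IP address simply drops out of the resolved rule, which is why the accuracy of the object-ID/IP mapping matters so much.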

The vCenter Server represents every object with a unique ID. For example, “Compute Cluster A” is in fact domain-c25 and “Compute Cluster B” is domain-c26. Here are screenshots from the vCenter MOB:


 NSX DFW and VMtools:

The NSX (up to the current version, 6.1.3) relies on VMtools to learn the IP address of the guest OS. The IP address learned from VMtools is stored in the vCenter database and reported to NSX Manager. We can tell whether VMtools reports the IP address in the VM Summary screen:


The vm-id can be retrieved from vCenter at the following path:
https://<vCenter server>/mob and select content -> rootFolder -> childEntity -> vmFolder.


To view the VM's IP address, click on the vm-id from the list above, for example vm-36 (web-sv-01a).

     GuestInfo -> Net

In vCenter MOB “web-sv-01a” has Object ID: vm-36 with IP address “”.


If VMtools is stopped or removed, vCenter removes the IP address entry immediately. An update notification is sent to NSX Manager, which causes the firewall module to send a list of updates to all the vShield-Stateful-Firewall daemons in protobuf format. If we configure firewall rules using vCenter objects (not IP addresses), as shown in the screenshot below, the VM's traffic will match only the last firewall rule (most of the time called the catch-all rule).


If this rule is configured to block (as in this example), the VM will be blocked from the network; but if the rule is set to permit, the VM gets full network access, which allows the user to bypass the security policy.

Spoofguard: an NSX feature that we can use to eliminate the need for VMtools to learn the VM IP address, but this will be addressed in a different blog post.



 NSX Firewall and vMotion:

vMotion: enables moving a VM from one ESXi host to another while the VM is powered on and connected to the network in a vSphere environment.

This feature can be managed automatically by the vSphere DRS mechanism or manually by the vSphere administrator.

As we saw in the NSX DFW architecture, each VM has two separate tables:

DFW rule table: contains the firewall rules.

DFW connection table: contains the live, active (approved) connections passing through this VM.

When the VM1 vMotion process starts, these two tables follow VM1 over the vMotion link.

NSX dFW vMotion


When the vMotion process completes, VM1 lands on esxcomp-01b with the same firewall rules and the same connection table, and as a result there is no traffic disruption for VM1.

NSX dFW after vMotion


Note: The NSX Manager is not involved in this vMotion since we don’t use the “Applied To” feature (Explained later).

NSX Firewall Applied To:

By default, when we create a firewall rule in NSX, the “Applied To” field is set to “Distributed Firewall”. The firewall rule is stored in the NSX Manager database and applied to all VM vNICs, regardless of the VMs' location. It’s important to mention that even when a DFW rule is applied to all VMs, we still need a match on source/destination for the rule to take action.

The Applied To field is determined by vSphere objects: Cluster, Datacenter, vDS Distributed PortGroup, Logical Switch, Edge, Host system, Security Group, VM, or even vNIC!

NSX Apply To option


When we start using the “Applied To” field, NSX Manager maps the “Applied To” object to the corresponding vSphere cluster. Only the ESXi hosts in that cluster receive the rule.

Each ESXi host that receives the rule uses the vShield-Stateful-Firewall daemon to parse it and work out which VMs need to apply it. When using the “Applied To” field, the scope boundary is the vSphere cluster.

The figure below shows “Rule ID” 1002, for which we’ve configured “Applied To” as Distributed Firewall (the default behavior).

NSX Apply To 3

When we push this firewall rule, NSX Manager sends it to all vSphere clusters.

As a result, all VMs get rule ID 1002 at their vNIC level.

NSX Apply To 1

Continuing our example, we add another rule, ID 1005, for which we set “Applied To” to web-sv-01a and web-sv-01b. “Rule ID” 1002 stays the same, with “Applied To” set to Distributed Firewall.

NSX Apply To 2

Assume we have the following machines: web-sv-01a, web-sv-02a, app-sv-01a and sec-mgr-01a, and that we’ve configured the rules above.

web-sv-01a and app-sv-01a are part of “Compute Cluster A”, web-sv-02a is part of “Compute Cluster B”, and sec-mgr-01a runs in “Management Edge Cluster”.

Now when we push this policy, the NSX Manager needs to work out the rule boundaries for each cluster.

NSX Apply To 4

Based on this rule calculation, NSX Manager pushes the firewall update only to “Compute Cluster A” and “Compute Cluster B”. “Management Edge Cluster” does not receive any update because none of its vSphere objects are part of the “Applied To” field. When “Compute Cluster A” receives the firewall update, the vSIP kernel module needs to parse which VMs have to apply rule 1005. Only web-sv-01a gets the new rule ID 1005, in addition to the old rule ID 1002. The VM named app-sv-01a does not get any firewall rule update. When “Compute Cluster B” receives the rule update, all ESXi hosts inside the cluster get the firewall update and vSIP parses it, but only the ESXi host running the VM named web-sv-01b applies rule ID 1005.
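The boundary calculation can be modelled with two small functions. This is a sketch under assumptions rather than NSX Manager's actual logic: the dictionaries are sample data built from the machine list above, `None` stands in for “Distributed Firewall”, and the second “Applied To” VM is assumed to be the web server running in Compute Cluster B (web-sv-02a in the machine list).

```python
from typing import Dict, List, Optional, Set

# VM name -> vSphere cluster, taken from the example machine list.
VM_CLUSTER: Dict[str, str] = {
    "web-sv-01a": "Compute Cluster A",
    "app-sv-01a": "Compute Cluster A",
    "web-sv-02a": "Compute Cluster B",
    "sec-mgr-01a": "Management Edge Cluster",
}

# Rule ID -> "Applied To" VM list; None models "Distributed Firewall".
RULES: Dict[int, Optional[List[str]]] = {
    1005: ["web-sv-01a", "web-sv-02a"],
    1002: None,
}


def clusters_receiving(
    applied_to: Optional[List[str]], vm_cluster: Dict[str, str]
) -> Set[str]:
    """Clusters whose hosts are sent a rule with this "Applied To" value."""
    if applied_to is None:
        return set(vm_cluster.values())  # pushed to every cluster
    return {vm_cluster[vm] for vm in applied_to if vm in vm_cluster}


def rules_for_vm(vm: str, rules: Dict[int, Optional[List[str]]]) -> List[int]:
    """Rule IDs that end up on this VM's vNIC, in rule-table order."""
    return [
        rule_id
        for rule_id, applied_to in rules.items()
        if applied_to is None or vm in applied_to
    ]


scope_1005 = clusters_receiving(RULES[1005], VM_CLUSTER)
web_rules = rules_for_vm("web-sv-01a", RULES)
app_rules = rules_for_vm("app-sv-01a", RULES)
```

The first function mirrors the cluster-level push decision, the second mirrors the per-VM filtering that happens on each host after the push.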


“Applied To” benefits:

  • It reduces the number of rules per VMware-sfw, which improves efficiency because the DFW has fewer rules to evaluate for every new session.
  • In case of overlapping IP addresses in a multi-tenant environment, we must use “Applied To” to distinguish one tenant from another.



Cross cluster vMotion with “Applied To”:

When we do use the “Applied To” feature and a VM performs a vMotion across clusters, the NSX Manager is involved in the process in order to update the destination cluster with the relevant firewall rules. The NSX Manager must be up to complete this operation.

For example, the VM named web-sv-01a needs to vMotion from Compute Cluster A to Management Edge Cluster. vCenter sends a vMotion notification to NSX Manager, and as a result NSX Manager triggers a policy push to all ESXi hosts in Management Edge Cluster. web-sv-01a gets the same rules as before the vMotion, with just a change in the domain object from “domain-c25” to “domain-c7”.

NSX Apply To 5


If the NSX Manager is down, no rule update is pushed! When the VM lands on the destination cluster, no VM-specific rule is applied to it. It’s important to note that when the NSX Manager is down, the forwarding plane of all existing VMs continues to work with their DFW rules; only “new” VMs cannot get firewall rules until NSX Manager comes back.

The NSX DFW keeps the rule table as a “.dat” file on the ESXi host at the following path:

/etc/vmware/vsfwd/vsipfw_ruleset.dat


NSX L2 to L4 Firewall:

The VMware NSX DFW can enforce security policy from L2 (Data Link Layer) to L4 (Transport Layer).

With L2 we can create DFW rules based on the MAC address or an L2 protocol such as ARP, RARP or LLDP.

L3/L4 security rules can be enforced with source/destination IP addresses or TCP/UDP ports.

VMware maintains a constantly growing list of third-party vendors.


Default Policy:

The DFW enforces L2 rules before L3 rules.

L2 Default Policy: fresh DFW installations have a default L2 policy with Source: Any, Destination: Any, Action: Allow


L3 Default Policy:

We have a default L3 policy with Source: Any, Destination: Any, Action: Allow


DFW Exclusion functionality:

Working on daily tasks with firewalls can sometimes lead to a situation where you end up blocking your access to the firewall management.
Regardless of the vendor you are working with, this is very challenging situation.
The end result of this scenario is that you are unable to access the firewall management to remove the rules that are blocking you from reaching the firewall management!

Think of a situation where you deploy a distributed firewall into each of your ESX hosts in a cluster, including the management cluster where you have your vCenter server located.

And then you change the default rule from the default “Allow” value to “Block” (as shown below):


What you’ve done by implementing this rule, can be shown in the following figure:

cut tree you sit on

Like the poor guy above dropping himself from his tree, by implementing this rule, you have blocked yourself from managing your vCenter.



How can we protect ourselves from this situation?

Put your vCenter (and other critical virtual machines) in an exclusion list.
Any VM on that list will not receive any distributed firewall rules.
Go to the Network & Security tab and click on NSX Manager.


Exclusion VM list 1


Double click on the IP address object. In my example it is

Exclusion VM list 2


Click on Manage:

Exclusion VM list 3

Go in the “Exclusion List” tab and click on the green plus button.


Choose your virtual machine.


That’s it!  Now your VC is excluded from any enforced firewall rules.


Exclusion VM list 6


Restoring default firewall rules set:

We can use the NSX Manager REST API to revert to the default firewall rules set to overcome a mistake when we do not yet have access to the VC.

Perform a configuration backup at this stage.
By default the NSX Manager is automatically excluded from DFW, so it is always possible to send API calls to it.
Using a REST Client or cURL:


Submit a DELETE request to:



After receiving the expected code status 204 we will revert to the default DFW policy with default rule set to allow.
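For readers who prefer a script to a REST client, here is a minimal sketch of building such a DELETE request with Python's standard library. The endpoint is not reproduced in this post, so the URL is passed in by the caller; the host name, path placeholder and credentials below are placeholders, HTTP Basic authentication is an assumption, and `build_reset_request` and `reset_succeeded` are hypothetical helper names.

```python
import base64
import urllib.request


def build_reset_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated DELETE request."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, method="DELETE")
    request.add_header("Authorization", f"Basic {token}")  # assumed auth scheme
    return request


def reset_succeeded(status_code: int) -> bool:
    """The post expects HTTP 204 when the default rule set is restored."""
    return status_code == 204


# Placeholder values only; replace the URL with the one from your API guide.
request = build_reset_request(
    "https://nsx-manager.example.invalid/REPLACE-WITH-API-PATH",
    "admin",
    "secret",
)
# Sending is left to the reader, for example:
#   with urllib.request.urlopen(request) as response:
#       ok = reset_succeeded(response.status)
```

Nothing is sent by the snippet itself, which makes it safe to adapt and inspect before pointing it at a real NSX Manager.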


Now we can access our VC! As we can see, we reverted to the default policy, but don’t panic :-) as we saved the policy.


Click on the “Load Saved Configuration” button.


Load the saved configuration from before the last save.

Note: Every time we push a DFW policy, NSX automatically saves the policy. There is a limit of 50 versions.

Exclusion VM list 11

Accept the warning by clicking Yes.

Exclusion VM list 12

Now we have our last policy from before we blocked our VC; it is loaded but not applied.


We will need to change the last Rule from Block to Allow to fix the problem.


And Click “Publish the Changes”.



  • It’s not possible to disable the DFW functionality per vNIC; the Exclusion List only allows disabling DFW functionality per VM.
  • The following are automatically excluded from DFW functions by default: the NSX Manager, NSX Controllers, Edge Services Gateways and Service VMs (PAN FW for instance).


NSX and Application Level Gateway (ALG):

An Application Level Gateway (ALG) is the ability of a firewall or NAT device to allow or block applications that use dynamic ephemeral ports to communicate. In the absence of an ALG, security and network administrators face a painful trade-off between communication and security: a network administrator may suggest opening a large number of ports, which poses a security threat to the network or the given server, while a security administrator may suggest blocking all ports except the well-known ones, which again breaks the communication.

An ALG reads the network address found inside the application payload, opens the respective ports for the subsequent communication, and synchronizes data across multiple sessions on different ports. For example, FTP uses different ports for session initiation (the control connection) and for the actual data transfers. An ALG would manage the information passed on the control connection as well as the data connection in this case.

NSX-v acts as an ALG for a few protocols such as FTP, CIFS, ORACLE TNS, MS-RPC and SUN-RPC.
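To make the FTP case concrete, the sketch below shows the kind of payload inspection an ALG has to do for active-mode FTP: the client announces its data endpoint in a `PORT h1,h2,h3,h4,p1,p2` command on the control connection, where the port is `p1 * 256 + p2`. This only illustrates the idea and is not how NSX implements it; `parse_ftp_port_command` is a hypothetical helper and the address is sample data.

```python
from typing import Tuple


def parse_ftp_port_command(line: str) -> Tuple[str, int]:
    """Extract the data-connection IP and port from an FTP PORT command."""
    verb, _, argument = line.strip().partition(" ")
    if verb.upper() != "PORT":
        raise ValueError("not a PORT command")
    fields = [int(part) for part in argument.split(",")]
    if len(fields) != 6 or any(not 0 <= value <= 255 for value in fields):
        raise ValueError("PORT expects six comma-separated values in 0..255")
    ip_address = ".".join(str(value) for value in fields[:4])
    port = fields[4] * 256 + fields[5]  # high byte, then low byte
    return ip_address, port


# A firewall with an FTP ALG would open a temporary pinhole for this endpoint.
data_endpoint = parse_ftp_port_command("PORT 192,0,2,10,19,137\r\n")
```

Because the data port is only known after reading the control channel, a plain L4 rule cannot express it in advance, which is exactly the gap the ALG fills.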


NSX DFW logs:

In NSX we have several log types: System Events, Audit Events, host Rule Messages and Flows.

NSX Manager System Events

NSX Manager System Events are related to NSX operations such as: FW configuration applied, failure to publish FW configuration, filter created, filter deleted, VM added to Security Group.

Each system event has a severity level:



To view the NSX System Events, go to Network & Security -> NSX Managers, click on the NSX Manager IP address -> Monitor -> System Events

Here is an example of a critical event polled by NSX Manager.

This event indicates that the vShield-Stateful-Firewall daemon went down on ESXi host ID “host-38”.

We can also view system events from the ESXi host itself. Here is an example of FW configuration events that can be viewed in vsfwd.log. These events are caused by a policy push from the NSX Manager to the ESXi host. The file location is: /var/log/vsfwd.log

Example output:

2015-03-10T02:43:12Z vsfwd: [INFO] Received vsa message of RuleSet, length 67

2015-03-10T02:43:12Z vsfwd: [INFO] Processed vsa message RuleSet: 67

2015-03-10T02:43:12Z vsfwd: [INFO] L2 rule optimization is enabled

2015-03-10T02:43:12Z vsfwd: [INFO] applying firewall config to vnic list on host host-10

2015-03-10T02:43:12Z vsfwd: [INFO] Enabling TCP strict policy for default drop rule.

2015-03-10T02:43:12Z vsfwd: [INFO] sending event: applied vmware-sfw ruleset 1425955389291 for vnic 500e519a-87fd-4acd-cee2-c97c2c6291ad.000

2015-03-10T02:43:12Z vsfwd: [INFO] successfully saved config to file /etc/vmware/vsfwd/vsipfw_ruleset.dat

2015-03-10T02:43:12Z vsfwd: [INFO] Sending vsa reply of domain-c7 host host-10: 0


NSX Manager Audit Events:

This log contains all the events related to admin logins and FW configuration changes (the DFW rule before and after the change).

To view the Audit Events, go to Network & Security -> NSX Managers, click on the NSX Manager IP address -> Monitor -> Audit Logs.

Here is an example of user bob logging in to the system:


ESXi DFW host Rules Messages:

The DFW has a dedicated log file (introduced in version 6.1) for viewing session start/termination and dropped/passed packets. The log contains the rule ID and the associated vCenter objects.

File name is: dfwpktlogs.log

File location on the ESXi host:  /var/log/dfwpktlogs.log 

A log example: more /var/log/dfwpktlogs.log 

2015-03-10T03:22:22.671Z INET match DROP domain-c7/1002 IN 242 UDP>


1002 is the DFW rule-id

domain-c7 is cluster ID in the vCenter MOB. is the source IP destination IP


To see entries in the log file we need to enable the “Log” option field.

By default, when we create a DFW rule, logging is not enabled. Logging occurs only after we enable the Log field in the firewall rules table.

In order to see “Allow” or “Block” packets in the DFW log files, we need to change the “Log” field from “Do not log” to “Log”.

In the following example we’re changing the last rule, ID 1002, from “Do not log” to “Log”:

In the next example for DFW log event we will see the results of a ping from my Control VM Management IP to

~ # tail -f /var/log/dfwpktlogs.log | grep

2015-03-10T03:20:31.274Z INET match DROP domain-c27/1002 IN 60 PROTO 1>

2015-03-10T03:20:35.794Z INET match DROP domain-c27/1002 IN 60 PROTO 1>
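When these logs are collected for analysis, a small parser helps pull out the cluster and rule ID. The sketch below splits a line on whitespace, based only on the sample lines shown in this post; the field names `family`, `reason`, `direction` and `size` are my own labels rather than an official format description, and whatever follows the size field is kept as raw text because the samples are truncated.

```python
from typing import Dict, Union


def parse_dfw_packet_log(line: str) -> Dict[str, Union[str, int]]:
    """Split one dfwpktlogs.log line into labelled fields (labels are assumed)."""
    parts = line.split(None, 7)
    if len(parts) < 7:
        raise ValueError("unexpected log line layout")
    timestamp, family, reason, action, scope, direction, size = parts[:7]
    cluster_id, _, rule_id = scope.partition("/")
    return {
        "timestamp": timestamp,
        "family": family,
        "reason": reason,
        "action": action,
        "cluster_id": cluster_id,   # cluster ID as seen in the vCenter MOB
        "rule_id": int(rule_id),    # DFW rule ID
        "direction": direction,
        "size": int(size),
        "rest": parts[7] if len(parts) > 7 else "",  # truncated remainder
    }


sample = "2015-03-10T03:20:31.274Z INET match DROP domain-c27/1002 IN 60 PROTO 1>"
entry = parse_dfw_packet_log(sample)
```

Grouping parsed entries by `rule_id` and `action` is a quick way to see which rule is responsible for unexpected drops.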


Live Flows:

With DFW we have the ability to view live flows. These flows are pulled by the vShield-Stateful-Firewall daemon from the vSIP kernel module and aggregated appropriately. The NSX Manager pulls normal flows from the daemon every 5 minutes and real-time flows every 5 seconds.

Enable Flow Monitoring by going to “Flow Monitoring” -> Configuration and clicking Enable. Global Flow Collection should change to a green “Enabled” status.

To view the vNIC flows, go to the “Live Flow” tab and browse for a specific VM and vNIC.

Click the start button.

Live flows will show up on the screen. The refresh rate is 5 seconds.

On the ESXi host, we can see the related events in the /var/log/vsfwd.log file:

2015-03-18T03:20:01Z vsfwd: [INFO] Received vsa message of FlowConfiguration, length 120

2015-03-18T03:20:01Z vsfwd: [INFO] Processed vsa message FlowConfiguration: 120

2015-03-18T03:20:01Z vsfwd: [INFO] Loaded flow config: [120]

2015-03-18T03:21:29Z vsfwd: [INFO] Received message in request queue of topic FlowRequest

2015-03-18T03:21:29Z vsfwd: [INFO] Received vsa message of FlowRequest, length 52

2015-03-18T03:21:29Z vsfwd: [INFO] Processed vsa message FlowRequest: 52

2015-03-18T03:21:29Z vsfwd: [INFO] rmqRealTimeFlowDataRetrieve started

2015-03-18T03:21:29Z vsfwd: [INFO] Done with configuring start of real time flows

2015-03-18T03:21:29Z vsfwd: [INFO] rmqRealTimeFlowDataPush started


Service Composer:

Service Composer helps you to provision and assign network and security services to applications in a virtual infrastructure.

You map these services to a security group, and the services are applied to the virtual machines in the security group.

Security group

With NSX DFW we have the ability to group vCenter elements such as VMs into containers called security groups. Based on these security groups we can build DFW rules. The definition of which vSphere objects are part of a security group can be dynamic or static.

Security groups can be consumed directly in the Firewall tab without using the Service Composer.


Security Group = (Dynamic Inclusion + Static Inclusion) – Static Exclusion
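The formula maps directly onto set operations. The sketch below is an illustration only: `dynamic_by_name` and `security_group_members` are hypothetical helpers, and the VM list is sample data (db-sv-01a is made up for the example).

```python
from typing import Iterable, Set


def dynamic_by_name(vm_names: Iterable[str], needle: str) -> Set[str]:
    """Dynamic inclusion: every VM whose name contains the given text."""
    return {name for name in vm_names if needle in name}


def security_group_members(
    dynamic: Iterable[str],
    static_include: Iterable[str],
    static_exclude: Iterable[str],
) -> Set[str]:
    """(Dynamic Inclusion + Static Inclusion) - Static Exclusion."""
    return (set(dynamic) | set(static_include)) - set(static_exclude)


inventory = ["web-sv-01a", "web-sv-02a", "app-sv-01a", "db-sv-01a"]
web_vms = dynamic_by_name(inventory, "web")
members = security_group_members(
    dynamic=web_vms,
    static_include={"db-sv-01a"},
    static_exclude={"web-sv-02a"},
)
```

Because the exclusion is subtracted last, a statically excluded VM stays out of the group even when it matches a dynamic criterion.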

Example: creating a security group based on VM name:

Go to Network & Security -> Service Composer -> Security Groups



Dynamic inclusion:

A list of dynamic options for inclusion.

In the example below we chose “VM name” to match any VM whose name contains the word “web”:


Static Inclusion:

We can select the object type that we want to (permanently) include as part of this security group.

With static inclusion we can create nested security groups.

  • Nested groups are groups inside groups.

In the following example we will not exclude any object:

 Static exclusion:

We can select the object types that we want to permanently exclude from this security group.



Summary of Security Group:


We can view the security group's members from the “Security Groups” tab.

In our example, the “Web Servers” security group has two VMs, web-sv-01a and web-sv-02a, and the group membership is handled dynamically because the criterion is that they have “web” as part of their VM name.

If a new VM called “web-sv-03a” is created in this vCenter, it will automatically become part of this security group.


Security policy using security groups:

We can create a firewall rule that leverages this new security group:

Source can be any and destination will be the “Web Servers” security group.

The service can be one L4 service or group of services, here we chose HTTPS. 


We can apply this policy to any cluster running NSX “Distributed Firewall”.

We can apply this policy to security groups, which results in the rule (policy) being applied only to the VMs that are part of those security groups.

The security rule:

vSphere environments are dynamic by nature. When new VMs join (or leave) a security group (e.g. “Web Servers”), vCenter updates its database and as a result an update notification is sent to the NSX Manager. This notification triggers a list of updates to the VMs that are part of this security group, because the group is part of the “Applied To” field. Security administrators do not need to constantly update the policy rules every time new VMs join or leave the network. The NSX firewall policy automatically reflects these changes, and this is the reason using vCenter objects is so powerful.



Security tag:

To keep the flexibility of virtualization without compromising security, VMware introduced the security tag.

By adding a tag attribute to VMs we can apply security policy. Adding or removing a tag on a VM can be done dynamically by automation, by a third party, or even manually.

We can use the example above, in which we created a security group called “Web Servers”, but rather than using a VM name containing “web” as the membership criterion, we attach a security tag to the VM.

Create a security tag manually:


Create the security tag name:

We now have a user-defined security tag with a VM count of 0:


Apply the security tag manually:

Select the “web servers” security tag name and right-click.


Filter the list by “web” and choose web-sv-01a and web-sv-02a:

Now we can modify the “Web Servers” security group to use a dynamic inclusion criterion based on the security tag:


After this change, the “web servers” security tag has a VM count of two.


The security group “Web Servers” contains the VMs web-sv-01a and web-sv-02a.



On a VM’s summary page we can see the security tags applied to the VM and the security groups it belongs to. From the “web-sv-01a” example:

Security Policy:

With a security policy we can create templates containing DFW policy approved by the security admin. The policy defines “how you want to protect” your environment; it is then applied to security groups, which define “what you want to protect”. A security policy may also contain traffic redirection rules to third-party vendors for service chaining.

Security policy is part of the service composer building blocks.

We can apply a security policy to more than one security group.

In the example below we apply “Web Security Policy” to both “Security Group 1” and “Security Group 2”.




A different option is to apply two different security policies to the same security group.

This can result in a contradiction between the policies.

For example, we apply “Security Policy 1” and “Security Policy 2” to “WEB Security Groups”:

The precedence between security policies is determined by a “Weight” value, configured by the security admin.

In the following example we demonstrate this by creating two different security policies, “Allow ICMP SP” and “Allow WEB SP”, and applying both to the previously created security group “Web Servers”.

Create “Allow ICMP SP”:


In “Create Firewall Rules” (the “Weight” is 4300):

  • The related action is Allow
  • The source field is: any
  • Destination is: “Policy Security Groups”.

This is the interesting part: because this security policy works as a template, we may reuse it for different security groups, so our goal is to avoid tying the template to a specific security group.

Service: ICMP Echo Request, ICMP Echo Reply.


At “Ready to complete”, click Finish:


In the firewall tab we can note that at this stage:

  • We have not applied this security policy to any security group, so the policy has not been activated yet. It is shown as the gray policy in the firewall tab.
  • There is no security group in the destination:


Create “Allow WEB SP” in the same way:

Notice that the “Weight” field is 1300, which is lower than the 4300 of the previous “Allow ICMP SP”.

Create the WEB rule (same flow as above):


The firewall policy order shows “Allow ICMP” before “Allow WEB”
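The order above follows from the weights: the policy with weight 4300 is listed before the policy with weight 1300, so the higher weight takes precedence. A minimal sketch of that ordering in Python, where the `SecurityPolicy` class and `precedence_order` function are hypothetical names used only for this illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityPolicy:
    name: str
    weight: int


def precedence_order(policies):
    """Sort policies so that the highest weight comes first."""
    return sorted(policies, key=lambda p: p.weight, reverse=True)


# The two policies from this example, entered in "wrong" order on purpose.
policies = [
    SecurityPolicy("Allow WEB SP", 1300),
    SecurityPolicy("Allow ICMP SP", 4300),
]
for policy in precedence_order(policies):
    print(policy.weight, policy.name)
```

When two policies applied to the same security group contradict each other, this is the order in which their rules end up in the firewall table.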

Now we apply both security policies on the same security group, using “Apply Policy”:

Choose “Allow ICMP Security Policy”:

And do the same for the second security policy called “Allow WEB SP”.

In the “Security Policy” tab view we can see the results of this action:

From the “Firewall” tab we can see that we now have two activated Service Composer security rules.

In the Service Composer canvas view we have an excellent summary of the security services that were applied to the security group:


3rd party vendor service integration:

The NSX DFW can integrate with third-party vendors to achieve a higher level of application security through additional services. The partner needs to register its service with the NSX Manager.

We define the traffic redirection policy in service composer.

For example, we can redirect the traffic that leaves/enters the VMs to a third party partner product device for inspection.

We can define traffic redirection in two different places.

Using “Partner Security Service”:

In this example with a Palo Alto firewall, we define that traffic from “any” source destined to the PAN-SG-WEB security group will be redirected to the PAN firewall:


Using security policy:


We follow the same policy definition construct as the DFW (i.e. the same options for the source, destination and services fields). The only difference is in the action field: instead of Block/Allow/Reject, the user selects redirect or no redirect, followed by a partner from the partner list (any partner that has been registered with NSX and successfully deployed on the platform). Finally, log options can be enabled for this traffic redirection rule.

Install NSX DFW:

NSX DFW prerequisites:

Table 1 lists the vSphere prerequisites for NSX DFW:

Component        Requirement
vCenter          5.5 or later
ESXi host        5.1, 5.5
NSX Manager      6.0 or later
VMtools          Must be installed and running on the VM guest OS if the DFW policy is based on vCenter objects. Any VMtools version can be used.
vSphere switch   VMware Distributed Switch (vDS) version 5.1 or later. VSS is not supported.


It is important to mention that the NSX DFW can work on a VXLAN port group or a VLAN port group. Enabling the DFW on a vSS is not tested by VMware and is not supported; if you enable it, it may work.

NSX Controller is not required with DFW. NSX Controller is only required for VXLAN and Logical Distributed Router.

The NSX DFW installation is done through the Host preparation process.

The NSX Manager triggers the installation of the NSX kernel modules inside a vSphere cluster and builds the NSX control plane fabric.

Note: Before the host preparation process we need to complete the following:

  • Registering the NSX Manager with vCenter.
  • Deploying the NSX Controllers.

Three components are involved during the NSX host preparation: vCenter, NSX Manager, EAM (ESX Agent Manager).


vCenter Server:
Management of vSphere compute infrastructure.

NSX Manager:
Provides the single point of configuration and REST API entry-points in a vSphere environment for NSX.

EAM (ESX Agent Management):
The middleware component between the NSX Manager and vCenter. The EAM is part of vCenter and is responsible for installing the VIBs (vSphere Installation Bundles), which are software packages prepared for installation on an ESXi host.


The host preparation begins when we click the “Install” button in the vCenter GUI.

  • This process is done at the vSphere Cluster level and not per ESXi host.
  • The EAM creates an agent to track the VIB installation process for each host. The VIBs are copied from the NSX Manager and cached in the EAM. If the VIBs are not present on the ESXi host, the EAM installs them (an ESXi host reboot is not needed at the end of this process).
  • During an NSX software upgrade, the EAM also takes care of removing the old installed version of the VIBs, but an ESXi host reboot is then needed.

VIBs installed during host preparation:

  • esx-dvfilter-switch-security
  • esx-vsip
  • esx-vxlan

Once the host preparation has completed successfully, the ESXi host has a fully working control plane.

Two control plane channels are created:

  • RabbitMQ (RMQ) message bus: Provides communication between the vsfwd (vShield Stateful Firewall daemon) process on the ESXi hypervisor and the NSX Manager over TCP/5671.
  • User World Agent (UWA) process (netcpa on the ESXi hypervisor): Establishes TCP/1234 connection over SSL communication channels to the Controller Cluster nodes.

Troubleshooting DFW installation:

The NSX DFW installation is actually the host preparation process.

Here are a few examples of host preparation issues.

DNS Issues:

EAM fails to deploy VIBs due to misconfigured DNS or no DNS configuration on host.
We can verify if those DFW VIBs have been successfully installed by connecting to each ESXi host in the cluster and issuing the command “esxcli software vib list”.

~# esxcli software vib list | grep esx-vsip

esx-vsip                       5.5.0-0.0.2318233                     VMware  VMwareCertified   2015-01-24

~ # esxcli software vib list | grep dvfilter

esx-dvfilter-switch-security   5.5.0-0.0.2318233                     VMware  VMwareCertified   2015-01-24
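When many hosts have to be checked, the same verification can be scripted against saved command output. The following is a minimal sketch, assuming the output of “esxcli software vib list” has been captured as text; the function names `installed_vibs` and `missing_vibs` are hypothetical, and the parser only relies on the first two whitespace-separated columns (name and version) seen in the output above.

```python
def installed_vibs(vib_list_output):
    """Map VIB name -> version from `esxcli software vib list` text."""
    vibs = {}
    for line in vib_list_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the dashed separator row.
        if len(fields) < 2 or fields[0] == "Name" or set(fields[0]) == {"-"}:
            continue
        vibs[fields[0]] = fields[1]
    return vibs


def missing_vibs(vib_list_output, required):
    """Return the required VIB names that do not appear in the output."""
    installed = installed_vibs(vib_list_output)
    return [name for name in required if name not in installed]


# One line copied from the output above; esx-vxlan is deliberately absent.
sample = """
esx-vsip                       5.5.0-0.0.2318233                     VMware  VMwareCertified   2015-01-24
"""
print(missing_vibs(sample, ["esx-vsip", "esx-vxlan"]))
```

An empty result means every VIB in the required list was found on that host.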


In this case, we may get a status of “Not Ready”:


The message clearly indicates “Agent VIB module not installed” on one or more hosts.

We can check the vSphere ESX Agent Manager for errors:

vCenter home > vCenter Solutions Manager > vSphere ESX Agent Manager

On “vSphere ESX Agent Manager”, check the status of “Agencies” prefixed with “_VCNS_153”. If any of the agencies has a bad status, select the agency and view its issues:


For more details on host preparation issues, check the associated log, /var/log/esxupdate.log, on the ESXi host.
Log in to the ESXi host that has the issue and run “tail /var/log/esxupdate.log” to view the log.


From the log it becomes clear that the issue may be related to DNS name resolution.

Configure the DNS settings on the ESXi host for the NSX host preparation to succeed.


TCP/80 from ESXi to vCenter is blocked:

The ESXi host is unable to connect to the vCenter EAM on TCP/80.

This can be caused by a firewall blocking communication on this port. From the /var/log/esxupdate.log file on the ESXi host:

esxupdate: esxupdate: ERROR: MetadataDownloadError: (‘http://VC_IP_Address:80/eam/vib?id=xxx-xxx-xxx-xxx), None, “( http://VC_IP_Address:80/eam/vib?id=xxx-xxx-xxx-xxx), ‘/tmp/tmp_TKl58’, ‘[Errno 4] IOError: <urlopen error [Errno 111] Connection refused>’)”)
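A “Connection refused” or a timeout on this port can be reproduced outside of the host preparation workflow with a plain TCP connect test. The sketch below uses only the Python standard library; `tcp_port_open` is a hypothetical helper name, and the test is only meaningful when run from a machine that sits on the same network path as the ESXi host.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example call (replace the placeholder with your vCenter address):
#   tcp_port_open("VC_IP_Address", 80)
```

A result of False does not say which device dropped the connection; it only confirms that the port is not reachable from where the test was run.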

The NSX-v has a list of ports that need to be open in order for the host preparation to succeed.
The complete list can be found in:

Existing VIBs version:

If an old version of the VIBs exists on the ESXi host, the EAM will remove the old VIBs, but the host preparation will not continue automatically.

We need to reboot the ESXi host to complete the process (this condition is clearly indicated next to the host name in vCenter).


ESXi bootbank space issue:

If you upgrade ESXi 5.1u1 to ESXi 5.5 and then start the NSX host preparation, you may hit an issue, and in the /var/log/esxupdate.log file you will see a message like:
“Installationerror: the pending transaction required 240MB free space, however the maximum size is 239 MB”
I faced this issue with a custom ISO for an IBM blade, but it may appear with other vendors.

Workaround: install a fresh ESXi 5.5 custom ISO (this is the version I upgraded to).



If vCenter runs on a Windows machine, other applications may be installed that already use port 80, causing a conflict with the EAM port TCP/80.

For example, by default the IIS server uses TCP/80.

Use a different port for EAM:

Change the port from 80 to an unused port in eam.properties in \ProgramFiles\VMware\Infrastructure\tomcat\webapps\eam\WEB-INF\


Download VIBs link:

The NSX Manager has a direct link to download the VIBs as a zip file:



Reverting installation:

Reverting an NSX-prepared ESXi host requires the following steps:

  • Put the ESXi host in maintenance mode.
  • Remove the ESXi host from the vSphere cluster. This automatically uninstalls the NSX VIBs.

Note: The ESXi host must be rebooted to complete the operation.

Manually Uninstall VIBs:

The following commands can be entered directly on the ESXi host to remove the installed NSX VIBs:

esxcli software vib remove -n esx-vxlan

esxcli software vib remove -n esx-vsip

esxcli software vib remove -n esx-dvfilter-switch-security

Note: The ESXi host must be rebooted to complete the operation

DFW (UWA) agent issues:

The VIB installation completes successfully, but on rare occasions one or both user world agents do not function correctly. This can manifest itself as either:

  • The firewall showing a bad status (“Error”, for example).
  • The control plane between the hypervisor(s) and the controllers being down.


Validate that the message bus service is active on the NSX Manager:

Check the status of the message bus user world agent by running the command /etc/init.d/vShield-Stateful-Firewall status on the ESXi hosts.



vsfwd (vShield Stateful Firewall daemon) is the service daemon part of the UWA (User World Agent) running on the ESXi host.

To check whether the vsfwd daemon is working properly, issue the following CLI command:

~ # ps | grep vsfwd

36169 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36170 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36171 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36172 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36173 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36174 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36175 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36176 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36178 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36179 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd
36909 36169 vsfwd                /usr/lib/vmware/vsfw/vsfwd


ESXi lists the threads this way; these entries are threads, not separate processes.

The activities provided by vsfwd are performed by several threads:


  • Firewall rule publishing, flow monitoring, NetX configuration, heartbeat, threshold monitoring, IPFIX, netcpa proxy and so on are all vsfwd activities that are run by these threads.
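To confirm quickly that all rows of the ps output belong to one daemon instance, we can count the rows per value of the second column, which is identical (36169) for every row in the output above. This is a small sketch over saved ps output: `rows_per_group` is a hypothetical helper, reading the second column as a shared group ID is an assumption drawn from that output, and the match keys on the command path directory rather than on the process name.

```python
from collections import Counter


def rows_per_group(ps_output, path_fragment="/usr/lib/vmware/vsfw/"):
    """Count ps rows per value of the second column for matching commands."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Expect at least: first ID, second ID, name, command path.
        if len(fields) >= 4 and path_fragment in fields[-1]:
            counts[fields[1]] += 1
    return dict(counts)


# Three rows in the shape of the output above (IDs copied from it).
sample = """
36169 36169 vsfwd /usr/lib/vmware/vsfw/vsfwd
36170 36169 vsfwd /usr/lib/vmware/vsfw/vsfwd
36171 36169 vsfwd /usr/lib/vmware/vsfw/vsfwd
"""
print(rows_per_group(sample))
```

A single key in the result indicates one running instance; an empty result means the daemon was not found in the captured output.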

Run the below command on the ESXi hosts to check for active messaging bus connection:

esxcli network ip connection list | grep 5671 (Message bus TCP connection)


Ensure that port 5671 is open for communication in any external network firewall.

Logs are recorded in /var/log/vsfwd.log. If the message bus is communicating properly with the NSX Manager, you should see heartbeat log entries like the following:

2015-03-10T14:10:34Z vsfwd: [INFO] Received HeartBeart Seq 2545 , Sending response
2015-03-10T15:22:34Z vsfwd: [INFO] Received HeartBeart Seq 2569 , Sending response
2015-03-10T16:34:34Z vsfwd: [INFO] Received HeartBeart Seq 2593 , Sending response
2015-03-10T17:46:34Z vsfwd: [INFO] Received HeartBeart Seq 2617 , Sending response
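The heartbeat entries carry both a timestamp and a sequence number, so a saved copy of the log can be checked for gaps. The sketch below is an assumption-laden illustration, not a VMware tool: `heartbeats` and `seconds_per_heartbeat` are hypothetical names, and the pattern matches only the line shape shown above while ignoring the process-name field.

```python
import re
from datetime import datetime

HEARTBEAT = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z .*Received HeartBea\w* Seq (\d+)"
)


def heartbeats(log_text):
    """Return (timestamp, sequence) pairs for every heartbeat line."""
    found = []
    for line in log_text.splitlines():
        match = HEARTBEAT.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
            found.append((stamp, int(match.group(2))))
    return found


def seconds_per_heartbeat(log_text):
    """Average seconds per sequence step, or None if it cannot be computed."""
    beats = heartbeats(log_text)
    if len(beats) < 2:
        return None
    (first_time, first_seq), (last_time, last_seq) = beats[0], beats[-1]
    if last_seq == first_seq:
        return None
    return (last_time - first_time).total_seconds() / (last_seq - first_seq)


# First and last heartbeat lines from the excerpt above.
sample = """
2015-03-10T14:10:34Z vsfwd: [INFO] Received HeartBeart Seq 2545 , Sending response
2015-03-10T17:46:34Z vsfwd: [INFO] Received HeartBeart Seq 2617 , Sending response
"""
print(seconds_per_heartbeat(sample))
```

For the excerpt above this works out to 180 seconds per sequence number. A noticeably larger value on a host would suggest missed heartbeats, although the excerpt alone does not confirm what interval is configured in a given environment.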



Since this module operates at the kernel level and is loaded as part of the boot image, it is highly unlikely to fail. However, if the distributed firewall functionality does fail, for instance on an ESXi host whose CPU is maxed out, traffic is blocked by default and packets are dropped for the protected VMs.

In the event that vsfwd is down:

  • You will see “messaging infrastructure down on host” in System Events (with the host name).
  • New rules will not be pushed to the host. The DFW UI will indicate that the last publish operation is pending rather than succeeded (this is true even if the push failed on only one host out of 100).
  • In all cases, enforcement of the already programmed rules never stops.
  • If vsfwd crashes on the host, a watchdog process restarts it automatically. The downtime is not observable because the restart is quick.
  • Every time vsfwd restarts, the NSX Manager is contacted to sync all rule information, so that the state is in sync between the NSX Manager and the host.
  • If vsfwd is stopped manually on the host (i.e. “/etc/init.d/vShield-Stateful-Firewall stop”), there is no attempt to restart the process.


esxcfg-advcfg -l | grep Rmq

Run this command on the ESXi hosts to show all Rmq variables; there should be 16 variables in total.

esxcfg-advcfg -g /UserVars/RmqIpAddress

Run this command on the ESXi hosts; it should display the NSX Manager IP address.



DFW Kernel Space:

Verify that the kernel module was loaded into memory:

VSIP (VMware Internetworking Service Insertion Platform) is the distributed firewall kernel module component.

Command to check whether the distributed firewall kernel module is loaded on the host:


~ # vmkload_mod -l | grep vsip

vsip                     13   452


The summarize-dvfilter command displays all IOChains and firewall filters on the ESXi host.


Fast Path = traffic filtered in the kernel module.

Slow Path = traffic redirected to a third-party vendor such as Palo Alto. In this example we can see that there is no Slow Path.

Filters: for each vNIC on this ESXi host, the output shows which slot each filter belongs to.


In the Fast Path we can see that the filter order starts from slot 0, dvfilter-faulter.


~ # summarize-dvfilter


agent: dvfilter-faulter, refCount: 1, rev: 0x1010000, apiRev: 0x1010000, module: dvfilter

agent: ESXi-Firewall, refCount: 5, rev: 0x1010000, apiRev: 0x1010000, module: esxfw

agent: vmware-sfw, refCount: 2, rev: 0x1010000, apiRev: 0x1010000, module: vsip

agent: dvfilter-generic-vmwareswsec, refCount: 2, rev: 0x1010000, apiRev: 0x1010000, module: dvfilter-switch-security

agent: bridgelearningfilter, refCount: 1, rev: 0x1010000, apiRev: 0x1010000, module: vdrb

agent: dvfilter-generic-vmware, refCount: 1, rev: 0x1010000, apiRev: 0x1010000, module: dvfilter-generic-fastpath

agent: dvfg-igmp, refCount: 1, rev: 0x1010000, apiRev: 0x1010000, module: dvfg-igmp




port 50331654 vmk2

  vNic slot 0

   name: nic-0-eth4294967295-ESXi-Firewall.0

   agentName: ESXi-Firewall

   state: IOChain Attached

   vmState: Detached

   failurePolicy: failOpen

   slowPathID: none

   filter source: Invalid


port 50331655 vmk3

  vNic slot 0

   name: nic-0-eth4294967295-ESXi-Firewall.0

   agentName: ESXi-Firewall

   state: IOChain Attached

   vmState: Detached

   failurePolicy: failOpen

   slowPathID: none

   filter source: Invalid


world 35677 vmm0:web-sv-02a vcUuid:’50 26 b7 4d c5 6c 1e d9-47 c0 09 25 95 80 2f ad’

 port 50331656 web-sv-02a.eth0


vNic slot 2

   name: nic-35677-eth0-vmware-sfw.2

   agentName: vmware-sfw

   state: IOChain Attached

   vmState: Detached

   failurePolicy: failClosed

   slowPathID: none

   filter source: Dynamic Filter Creation


vNic slot 1

   name: nic-35677-eth0-dvfilter-generic-vmware-swsec.1

   agentName: dvfilter-generic-vmwareswsec

   state: IOChain Attached

   vmState: Detached

   failurePolicy: failClosed

   slowPathID: none

   filter source: Alternate Opaque Channel


Packet capture command:


When a VM connects to a logical switch, NSX inserts different IOChain security services into the VM packet flow. Each service is represented by a unique slot ID.

In the figure below, the VM web-sv-01a runs on the ESXi host esxcomp-01a. web-sv-01a has three different IOChains before its packets get to the vDS port group. These IOChains handle VM packet processing at the kernel level.


Slot 0: DVFilter (Distributed Virtual Filter):

The Distributed Virtual Filter (DVFilter) sits in the VMkernel at slot 0, between the protected vNIC and its associated Distributed Virtual Switch (DVS) port, and is instantiated when a virtual machine with a protected virtual NIC is created. It monitors the incoming and outgoing traffic on the protected virtual NIC and performs stateless filtering.

Slot 1: sw-sec (Switch Security): The sw-sec module learns the VM’s IP and MAC addresses. sw-sec is a critical component: it captures DHCP Ack and ARP broadcast messages and forwards this information as unicast to the NSX Controller to perform the ARP suppression feature. sw-sec is also the layer where NSX IP SpoofGuard is implemented.



Slot 2: VMware-sfw: This is where the DFW firewall rules are stored and enforced. VMware-sfw contains the rules table and the connections table.

With vSphere we can capture VM traffic with the pktcap-uw command. For this example we send a continuous ping (ICMP echo request) from web-sv-01a. The capture needs to be placed on IOChain slot 2, using the appropriate filter name for web-sv-01a.

To find the exact filter name we use the summarize-dvfilter command.

We can grep for the exact name with the -A 3 switch, which shows 3 more lines after each line that matches the grep term.

From ESXi host name esxcomp-01a:

~ # summarize-dvfilter | grep web-sv-01a -A 3

world 35682 vmm0:web-sv-01a vcUuid:’50 26 c7 cd b6 f3 f4 bc-e5 33 3d 4b 25 5c 62 77′

 port 50331656 web-sv-01a.eth0

  vNic slot 2

   name: nic-35682-eth0-vmware-sfw.2

   agentName: vmware-sfw


From this output we can see that the filter name is nic-35682-eth0-vmware-sfw.2 for SLOT 2
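Picking the filter name out by eye gets tedious on hosts with many VMs. The following is a minimal sketch that extracts the slot-to-filter-name mapping for one VM from saved summarize-dvfilter output; `dvfilter_names` is a hypothetical helper, it relies only on the “port”, “vNic slot” and “name:” lines shown above, and it assumes the VM name contains no spaces.

```python
import re

SLOT_LINE = re.compile(r"vNic slot (\d+)$")


def dvfilter_names(summary_text, vm_name):
    """Return {slot: filter name} for one VM from summarize-dvfilter text."""
    filters = {}
    in_vm = False
    slot = None
    for raw in summary_text.splitlines():
        line = raw.strip()
        if line.startswith("port "):
            # e.g. "port 50331656 web-sv-01a.eth0" starts a new vNIC section.
            in_vm = line.split()[-1].startswith(vm_name + ".")
            slot = None
            continue
        if not in_vm:
            continue
        match = SLOT_LINE.match(line)
        if match:
            slot = int(match.group(1))
        elif slot is not None and line.startswith("name: "):
            filters[slot] = line[len("name: "):]
            slot = None
    return filters


# Lines copied from the grep output above.
sample = """
world 35682 vmm0:web-sv-01a vcUuid:'50 26 c7 cd b6 f3 f4 bc-e5 33 3d 4b 25 5c 62 77'
 port 50331656 web-sv-01a.eth0
  vNic slot 2
   name: nic-35682-eth0-vmware-sfw.2
   agentName: vmware-sfw
"""
print(dvfilter_names(sample, "web-sv-01a")[2])
```

The value returned for slot 2 is the filter name to use for the capture.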

pktcap-uw help, output of the -A option:

esxcomp-01a # pktcap-uw -A

Supported capture points:
        1: Dynamic -- The dynamic inserted runtime capture point.
        2: UplinkRcv -- The function that receives packets from uplink dev
        3: UplinkSnd -- Function to Tx packets on uplink
        4: Vmxnet3Tx -- Function in vnic backend to Tx packets from guest
        5: Vmxnet3Rx -- Function in vnic backend to Rx packets to guest
        6: PortInput -- Port_Input function of any given port
        7: IOChain -- The virtual switch port iochain capture point.
        8: EtherswitchDispath -- Function that receives packets for switch
        9: EtherswitchOutput -- Function that sends out packets, from switch
        10: PortOutput -- <


Thanks to Tiran Efrat and Francis Guillier for reviewing this document and answering questions while it was being written.
