GUI-Based vSAN Bootstrap VCSA Deployment

When deploying my 3-node NUC vSAN lab, I got to try out the new bootstrap VCSA GUI installer, blogged about here…

Once you download and unpack the VCSA, you’ll find a file called installer within the following path: %filepath%\vcsa-ui-installer\win32\installer.exe (there are Mac & Linux options too).

You need to have installed ESXi on your target host first. Also, if your networking is trunked to the ESXi host, make sure you tag the VM port group where the VCSA will reside before deploying.

By default, this installer will enable the management VMkernel port for vSAN. This was fine with me, as once the VCSA was deployed I retrospectively changed all the vSAN-related host settings.

You will also need the VCSA to be resolvable by the DNS server you specify during the setup. I already had a Domain Controller deployed within Workstation on my laptop, which was routable from my lab network, so I used that.
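Since the deployment will fail if the VCSA’s FQDN doesn’t resolve, a quick pre-flight check can save a failed deploy. A minimal Python sketch (the hostname in the example is a made-up lab name, not from this post):

```python
import socket

def resolvable(fqdn):
    """Return the IP the local resolver returns for fqdn, or None if it doesn't resolve."""
    try:
        return socket.gethostbyname(fqdn)
    except socket.gaierror:
        return None

# Example with a hypothetical lab name: resolvable("vcsa01.lab.local")
```

Run this from a machine on the same network as the lab, pointed at the same DNS server you’ll give the installer.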

The GUI is relatively straightforward; ensure you select the option to install on a new Virtual SAN cluster.




Select the disks on the host for the appropriate tier. If you want to enable dedupe, do it now, as the disk group will need to be evacuated to enable deduplication and compression at a later date!


Do the traditional next, next, finish and it’ll start deploying. The VCSA deployment is now two-stage; once this stage is complete, you need to log into the VCSA management page to complete the remainder of the setup.


Here is the proof of the pudding: my incredibly annoyingly named disks are now claimed by vSAN.


The host is also now a member of a VSAN cluster.


A single host vsanDatastore


You can now complete the configuration through vCenter.

The Intel NUC\VSAN Home Lab

Having a home lab is a vital part of self-progression. I have always re-invested in myself, committing to self-study and home learning, and it has helped me with career progression. So, after my pizza box mistake, I was in need of some toys to play with to continue that progression.

I looked into a number of options, including Supermicro servers, HP MicroServers and even building my own hosts in micro-ATX cases, but ultimately I don’t think anything gives you the same bang for your buck (or £ in my case) as the Intel NUC.

The Intel NUC is the “official” unofficial VMware home lab!! There are posts all over the tinterweb from vSuperstars about why these bite-size computers are ideal for a home lab; the downside being that they are not that cheap.

My BOM (bill of materials) doesn’t really differ from anyone else’s; however, if you’ve managed to stumble your way to my blog first, it goes like this:






  • 3 x 5GB USB 3.0 thumb drives (To install ESXi on)

That lot cost me over £2,400 (don’t tell the missus!!)

The above will build you a 3 Node all flash vSAN cluster (you will need a switch)


The only thing that probably differs in my lab from others who have chosen NUCs is that I’m using an M.2 for the capacity tier and a SATA SSD for the caching tier. The only reason for this was that I still had the 60GB SSDs from my pizza box mistake, so I put them to good use!

The NUC only has one onboard NIC, so the USB to dual-port Gigabit adapter will give me three NICs per NUC (try saying that fast with a mouth full of Doritos!)

Two NICs for vSAN and NFS traffic (my “storage NICs”) and one for everything else (vMotion, data, MGMT).

I already have a Buffalo TeraStation and a Cisco 3750, which I used to complete the lab setup. The 3750 is used as layer 2 only; I created non-routable subnets for vMotion, vSAN, NFS & VXLAN (NSX is in the lab, more on that later).

Since all devices are patched into the same 3750 switch, I didn’t have to worry about routing those subnets outside the switch, but carving them up into their own VLANs helps limit broadcast traffic.

Mgmt and VM data sit on the same subnet, it’s not ideal… but it’s a lab.

The TeraStation is used to house ISOs/OVAs and maybe, eventually, the occasional low-I/O VM. It has two NICs: one NIC is on the non-routable NFS subnet, the other is on my MGMT/data subnet so I can manage it!

All NUC NICs (USB & onboard) support an MTU of more than 1600, which means I can have NSX in my lab. Unfortunately, however, they don’t quite support jumbo MTU. Again, it’s a lab, but in real-world scenarios vSAN, NFS and probably vMotion too should run across jumbo-frame-enabled NICs. vSAN should also ideally be backed by a 10Gb switch when using all-flash, but the 1Gb switch in my case works just fine… it’s a lab… just tell yourself “I won’t buy a 10Gb switch… I don’t need a 10Gb switch…”
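As a rough sanity check, the MTU requirements above can be captured in a few lines. The minimums here are my reading of the text (VXLAN needs more than 1600, jumbo frames are conventionally 9000), not an official VMware matrix:

```python
# Assumed minimum MTUs: standard Ethernet 1500, VXLAN/NSX transport > 1600,
# jumbo frames conventionally 9000. Working numbers for this post, not an
# official support matrix.
REQUIREMENTS = {"standard": 1500, "vxlan": 1601, "jumbo": 9000}

def services_supported(nic_mtu):
    """Return the traffic types a NIC configured with this MTU can carry."""
    return sorted(s for s, minimum in REQUIREMENTS.items() if nic_mtu >= minimum)
```

So a NIC capped at, say, MTU 1700 carries standard and VXLAN traffic but not jumbo frames, which is exactly the position these NUC NICs are in.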

So “back of a fag packet” logical & physical designs…





There are a few things to consider when building the NUC lab.

  1. Some versions of ESXi don’t support the NUC/USB NICs, so you’ll need to create a custom ISO image with the required VIBs. That MAY have changed with versions later than 6.5 U1 (I installed 6.5 U1); I’ve not kept in the loop with what does and doesn’t work. William Lam’s blog may be able to help you on that front.

Details of how I created a custom ISO for 6.5 U1 can be found here.

  2. If you’re planning to use vSAN as your sole storage resource, then you’ll need to “bootstrap” vSAN to deploy your vCenter: vSAN configuration is predominantly done in vCenter, and since you can’t deploy a vCenter without storage, you get stuck in a chicken-and-egg scenario. William Lam has some great tech blogs on how to do this manually, but since vCenter 6.5 U1 there’s an option within the VCSA deployment GUI to configure vSAN as you deploy the VCSA. I’ve briefly blogged about that here.

The lab performs really well. I’ll throw up a benchmarking blog at some point, but for now I hope you found the above useful!

Pizza box mistake…

Since leaving EMC and subsequently losing access to a lab, I had been mulling over a home lab for some time. Ultimately, what I wanted to do was keep upfront costs down. I built a 3-node DL360 G6 vSAN lab with 3 x 10K SAS HDs and a 60GB SATA SSD per node, 24GB RAM per node and 2 x Intel Xeon procs. It didn’t perform too shabbily, and all for a few hundred quid; a pretty good buy, or so I thought!

I’m not stupid (honest); I’d read horror stories about large electricity bills and thought “meh, how bad can it be?”. Luckily, Scottish Power (my energy provider) are on hand with their year-on-year electricity comparison graph to show just how bad it can be: from 15.05kWh on average (over 92 days) to 27.03kWh on average (over a shorter 78 days).

My capex saving is now an opex cost…
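For a rough idea of that opex cost, assuming those figures are daily averages and picking an illustrative unit rate of £0.15/kWh (my assumption, not a figure from Scottish Power or the original bill):

```python
baseline_kwh = 15.05   # average daily usage before the lab (92-day period)
lab_kwh = 27.03        # average daily usage with the lab running (78-day period)
unit_rate = 0.15       # GBP per kWh -- an assumed, illustrative rate

extra_per_day = lab_kwh - baseline_kwh            # ~11.98 kWh/day
extra_per_year = extra_per_day * unit_rate * 365  # rough annual running cost
print(f"~£{extra_per_year:.0f} per year")
```

Call it somewhere in the region of £650 a year at that rate; scale to your own tariff.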

Take my advice, avoid the pizza boxes… learn from my mistake, like UK Top Gear, I have been ambitious… but rubbish!

Now to start planning my next lab…

By the way, I have three DL360 G6s going if anyone wants to buy them?


VMware vCenter Workflow Manager service not starting

I had an issue where a vCenter update to 6.0 U2a failed and rolled back. After a restart, we were unable to power on VMs that were in a powered-off state.

On further inspection we found that the “VMware vCenter workflow manager” service wasn’t started and would not start.

In the workflow-manager.log, located at C:\ProgramData\VMware\vCenter\logs\workflow\, I could see errors relating to the service not starting and “Error creating bean with name ‘jmxconnectorstarter’”.


Scroll a little further in the Workflow-manager.log and you should see the phrase “Caused by”


My error stated “Port value out of range: -1”

Browse to the file located here: C:\ProgramData\VMware\vCenterServer\cfg\vmware-vpx-workflow\conf


In this file, workflow.jmx.port was set to -1. This was changed to 19999 by VMware support and, hey presto, the service started!
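The “Port value out of range: -1” error makes sense once you consider what a usable TCP port is. A trivial sketch of the check the JMX connector is effectively doing (my illustration, not VMware’s actual code):

```python
def valid_port(value):
    """A usable TCP listen port must be an integer in 1-65535; -1 is out of range."""
    try:
        port = int(value)
    except (TypeError, ValueError):
        return False
    return 1 <= port <= 65535

# workflow.jmx.port = -1 fails this check; 19999 passes.
```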

Powering on all VMs was then possible within the VC.

Moral of this story… name your beans!


vCenter 6u2a Upgrade failure with error 3010

I had an issue upgrading a VC to 6.0 U2a recently; when running the upgrade, I received the error “failed with error code 3010”.


There’s a VMware KB here which explains the issue, however the resolution description is light on detail.

Check the pkgmgr-comp-msi.log to find out which DLL is locked; the log files will be in the ZIP file downloaded when the installation fails.

I found it useful to search for “is being held in use” within this log file to identify the DLL being held.
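If the log is large, a short script can pull the held DLLs out for you. The line format here is a guess based on the phrase above, so adjust the pattern to match your actual log:

```python
import re

# Pattern assumes lines shaped roughly like:
#   "... vmeventmsg.dll ... is being held in use ..." -- adjust for your log.
HELD = re.compile(r"(\S+\.dll)\b.*?is being held in use", re.IGNORECASE)

def held_dlls(log_text):
    """Return the DLL names mentioned on 'is being held in use' lines."""
    return [m.group(1) for m in HELD.finditer(log_text)]
```

Feed it the contents of pkgmgr-comp-msi.log and you get the list of locked DLLs in one go.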


In my case it was vmeventmsg.dll.

I stopped the vCenter Server service (which in turn stops all the other services that depend on it).

Using process explorer

Download –

Open Process Explorer and select “Show Details for All Processes” under File.


Under View, ensure “Show Processes from All Users” is selected.


Select Find > Find Handle or DLL (Ctrl+F) and search for the DLL identified in the pkgmgr-comp-msi.log; the PID will be listed (in my case 912).


In Process Explorer, kill the process that’s holding the DLL hostage by right-clicking it and selecting “Kill Process”.


NOTE – I first tried using taskkill to end the process; however, it didn’t work, even though the process was listed when running tasklist.

Double-check in Process Explorer that the DLL is no longer in use.

Re-run the VC upgrade, go to the Winchester, have a nice cold pint, and wait for all of this to blow over.


Support of 3rd party vSwitches removed

VMware has announced the discontinuation of the 3rd-party vSwitch program in vSphere 6.5 U1, most notably affecting the Nexus 1000v.

When I say they’re discontinuing the program, I mean they’re actually removing the APIs 3rd party switches interact with! [ref 1]

Customers who remain on 5.5 or 6.0 and stick with the 1k will still be entitled to support.

Full disclosure: I’m an ex-VCE (now Dell EMC CPSD) employee. I was an Escalation Engineer responsible for Vblock support, and Vblocks come pre-packaged with the Nexus 1000v. You’d think that level of exposure would lead me down a path of whimsical nostalgia… “It had its place” or “it’ll be missed” may be phrases you’d expect an advocate of the Vblock to spout against the backdrop of such news. However… I’m ecstatic.

My feelings about the 1K are best summed up by Moff from Human Traffic and his hatred for Peter Andre…

The vDS is more than adequate to fit the use cases that previously could only be met by the 1K. As we move further into the world of converged infrastructure and software-defined everything, the 1K represents a time when infrastructure teams had stricter silos and network engineers wanted more control over the virtual network.

Not to mention the hundreds of hours of downtime caused by server engineers trying to find their way around a Nexus switch for the first time (if you’ve ever missed the word “add” when adding VLANs to the Uplink Port Profiles, you’ll understand my pain).

VMware have a 1000v-to-vDS automated migration tool available on their download site; however, the documentation shows a support matrix that’s a bit thin on the ground when it comes to supported 1K versions [ref 2].

At the time of writing, it lists only two supported 1K versions, which I assume translate as follows:

2.2.3 = SV2(2.3)

3.1.3 = SV3(1.3)

Table 1-1 Support Matrix

ESXi Version    Nexus 1000v Version
5.5             2.2.3
5.1             2.2.3

If the above rings true, from a Vblock perspective only RCMs 5.0.5 to 5.0.8 are officially in scope/tested for this tool, although I’m sure the good people at CPSD have a plan/announcement pending relating to their own tool/guide/professional service to move away from the 1K.

Of course, it’s worth reiterating that this only becomes relevant if you intend to move to vSphere 6.5, an RCM based on 6.5 isn’t even GA yet…

Those that will stay on RCM 5.0.* or RCM 6.0.* until tech refresh can sit back and relax knowing they can still get support.

The download also contains a guide to migrating away manually; the process is straightforward but could be lengthy if you have a large number of port profiles to migrate to port groups.

For Vblock, or for those using the 1K in a UCS environment: if you go down the manual migration route, keep in mind which load-balancing algorithms you can select. UCS only supports Route Based on Originating Virtual Port ID or Route Based on Source MAC Hash.

Route Based on IP Hash and Route Based on Physical NIC Load are not supported in UCS [ref 3].




The Great Big Platform Services Controller Blog Post

This blog post attempts to collate all the details on the Platform Services Controller (PSC).

First and foremost, let’s call out some things you can’t do, so you don’t read this whole blog only to find out your use case isn’t supported…


  • You CAN’T merge vSphere domains in vSphere 6.0. That is, if you have two vCenters with embedded (or external) PSCs that were deployed independently of one another, whether you used custom domain names or the default “vsphere.local” domain, they can’t be merged into a single vSphere domain. [ref1]
  • You CAN’T migrate a vCenter Server from one vSphere domain to another. [ref1]
  • You CAN’T use Enhanced Linked Mode between two separate vCenters in separate domains, as Enhanced Linked Mode requires all PSCs to be in the same domain. [ref1]
  • Snapshots: this is a contentious one. If a PSC is replicating to other PSCs within the same site or across sites, then rolling back to a previous snapshot is not supported, as it can result in a PSC being out of sync with its sibling PSCs; this also applies to image-level backups. It does not apply to standalone PSCs. [ref1] [ref6] [ref7]


  • You CAN migrate from an embedded PSC to an external PSC. There are some considerations when converting: certificates, and integrated applications such as SRM, vRO and vRA, need reconfiguring/repointing at the new PSC. When you migrate to an external PSC you use the cmsso-util reconfigure command rather than the cmsso-util repoint command; using “reconfigure” decommissions the embedded PSC. [ref2]
  • You CAN repoint a vCenter Server to another PSC within the same site, providing of course that the PSC is in the same vSphere domain. [ref3]
  • You CAN repoint a vCenter Server in SSO site 1 to a PSC in SSO site 2, providing of course that both sites are members of the same vSphere domain! HOWEVER, the pre-requisite is that vCenter Server is running 6.0 Update 1 or later. THIS FUNCTION HAS SINCE BEEN REMOVED IN VSPHERE 6.5, so it is only supported in versions 6.0 U1 up to (but not including) 6.5. Instructions on how to do this can be found in the references below. [ref4] In 6.5, or in 6.0 before U1, if no functional PSC instance is available in the same site as the vCenter, then you must deploy or install a new PSC instance in that site as a replication partner of a functional PSC instance from another site.

PSC Maximums

There’s a great blog post by leading vCommunity legend William Lam on the PSC maximums in 6.5

Those numbers vary slightly from those in 6.0

                                             vSphere 6.0 [ref7]   vSphere 6.5 [ref8]
Maximum PSCs per vSphere Domain              8                    10
Maximum Linked VCs                           10                   10
Maximum PSCs per site behind Load Balancer   4                    4

Musing/Question: I’ve struggled to find the maximum number of sites within a vSphere domain listed in any VMware documentation; I’ll update this blog if I find anything conclusive. Assuming at this point it’s limited by the number of PSCs allowed in a vSphere domain.

PSC Replication Topologies

When considering a PSC design, there are six high level PSC topologies VMware recommend:

  1. vCenter Server with Embedded PSC
  2. vCenter Server with External PSC
  3. PSC in Replicated Configuration
  4. PSC in HA Configuration
  5. vCenter Server Deployment Across Sites
  6. vCenter Server Deployment Across Sites with Load Balancer

VMware have provided a handy decision tree for those yet to make their deployment decision; shout out to @Emad_younis & @eck79 [ref9].


PSCs are multi-master, but the default replication topology for the PSC is one-to-one. In a scenario where more than two PSCs exist within a vSphere domain, the recommendation is to use a ring topology; this prevents a break in replication when a single PSC fails.
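As a sketch of what a ring means in terms of replication agreements (each PSC pairs with the next, wrapping around to the first), here’s a hypothetical helper that generates vdcrepadmin createagreement commands for a list of PSC FQDNs. The helper itself is my illustration; the vdcrepadmin syntax is the one covered in the commands section:

```python
def ring_agreements(pscs):
    """Pair each PSC with the next, wrapping around, so any single PSC failure
    still leaves every surviving PSC reachable around the ring."""
    n = len(pscs)
    return [(pscs[i], pscs[(i + 1) % n]) for i in range(n)]

def createagreement_cmds(pscs, password="%password%"):
    """Render a vdcrepadmin createagreement command per ring pair (illustrative)."""
    return [
        "/usr/lib/vmware-vmdir/bin/vdcrepadmin -f createagreement -2 "
        f"-h {src} -H {dst} -u administrator -w {password}"
        for src, dst in ring_agreements(pscs)
    ]
```

With six PSCs you get six agreements, the last one closing the loop back to the first PSC.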

Jeff Green in his virtual data cave has a good blog post below, which is relevant for 6.0 & 6.5

Expanding a little: in my example below, we have one vSphere (SSO) domain, two sites and three environments (Prod, Non-Prod & Development). Each environment has physical infrastructure at both sites for BCDR purposes, and each physical piece of infrastructure has a vCenter and a PSC. In this example, a ring topology would look like the below; two PSCs at the same site would need to fail in order to break replication.


Where PSC partner replication occurs over the WAN from site 1 to site 2, it would add further resilience if replication partners PSC1 & PSC4 were on a different circuit to PSC3 & PSC6; this would ensure replication continues in the event of a WAN circuit failure.

For example, if PSC2 fails, the VC can be repointed to either PSC1 or PSC3 within the same site while PSC2 is redeployed.


In addition, you could introduce load balancers into the environment and place all three PSCs at a site behind a load balancer; this would prevent the manual repointing of a VC in the event of a PSC failure. This design would be more suited to environments that require a highly available VC.


If each environment were its own Active Directory domain, you could configure an identity source for each domain too (see below for more details), keeping all VCs in the same vSphere domain and minimising vSphere administration while simultaneously restricting access to environments by specific AD domain credentials.

Musing/Question: if both cross-site partners use the same circuit, what happens if the cross-site PSC replication can’t occur? All PSCs will be up and replicating intra-site, but not cross-site. Will a split-brain situation occur? If contradicting configurations are implemented at either site, which will take preference? I plan on creating this environment and testing; I’ll provide the results in a future blog and update this one for reference.

PSC Responsibilities

  • It now handles the storing and generation of the SSL certificates within your vSphere environment.
  • It now handles the storing and replication of your VMware License Keys
  • It now handles the storing and replication of your permissions via the Global Permissions layer.
  • It now handles the storing and replication of your Tags and Categories.

Active Directory Trusts

Another good reference covers which Microsoft Active Directory trusts are supported and can be used in vSphere SSO when using AD as an identity source [ref10].

All PSCs should be joined to Active Directory. PSCs in the same vSphere domain can be added to different Active Directory domains, PROVIDED there is a trust relationship between the Active Directory domains. This will be relevant if you have a child Active Directory domain for different geo locations (e.g. EMEA/AMER/APAC) or environments (e.g. Prod/Non-Prod/Dev) and a different vSphere site within the same vSphere domain for each of those locations.

PSC ports

The required ports for PSC communication are listed in the following references; if you have firewalls between PSC nodes, or between PSCs and vCenters, then these ports will need to be opened [ref 10] [ref11].

The following ports are listening on a VC with embedded PSC and on a standalone external PSC


PSC replication occurs over TCP ports 389 & 2012 and UDP port 389, according to [ref11].

NOTE: In a real-world environment, I have seen VCs that had been upgraded from 5.5 to 6 with embedded PSCs replicating to each other over a WAN on 11712 & 11711 (with a firewall in between blocking 2012).

[Ref11] shows ports 11712 & 11711 as legacy and for 5.5 backwards compatibility only however I did find a reference here that lists 11711 & 11712 as vmdir.


It may be the case that as it was previously using 11711 & 11712 for replication, it’s continued to use these ports after an upgrade?

When migrating from the embedded PSC in VC 6 (updated from 5.5) to a newly deployed external PSC, replication only occurred over TCP ports 389 & 2012 and UDP 389 between the external PSCs.

Migrating a vCenter from an embedded PSC to an external PSC: the considerations

With a plain-vanilla install of vSphere, where you’re only concerned with migrating to an external PSC, the process is relatively straightforward.

The process becomes a little more intricate when other VMware solutions are registered against the embedded PSC, these solutions will need to be repointed to the new external PSC.

The most common solutions likely configured to use a PSC are vRO, SRM, NSX & vRA; 3rd-party backup tools that plug into the vCenter may also have a separate configuration for the PSC. In a scenario where you want to decommission the embedded PSC, these solutions will need to be repointed to the external PSC. In my experience, the PSC that these solutions point to has to be the same PSC that the VC points to.

********Update 01.09.2017********

At VMworld 2017 in Vegas, there were some good discussions around the PSC that are definitely worth a watch!!


You can use the following command to repoint a VC from one external PSC to another external PSC:

cmsso-util repoint --repoint-psc externalPSC --username administrator --domain-name vsphere.local --passwd password

However, when migrating away from an embedded PSC, you want to use the following command, which demotes the embedded PSC and then repoints to the external PSC:

cmsso-util reconfigure --repoint-psc externalPSC --username administrator --domain-name vsphere.local --passwd password

Troubleshooting and commands

The following commands can be used to check and test the PSC replication

  1. Find out which vSphere site a PSC is in
  • VCSA

/usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost

  • Windows

C:\Program Files\VMware\vCenter Server\vmafdd\vmafd-cli get-site-name --server-name localhost

  2. Find out the name of the vSphere domain
  • VCSA

/usr/lib/vmware-vmafd/bin/vmafd-cli get-domain --server-name localhost

  • Windows

C:\Program Files\VMware\vCenter Server\vmafdd\vmafd-cli get-domain --server-name localhost

  3. Find out which PSC a vCenter is pointing to
  • VCSA

/usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost


  • Windows

C:\Program Files\VMware\vCenter Server\vmafdd\vmafd-cli get-ls-location --server-name localhost

  4. Show all Platform Services Controllers in the vSphere domain
  • VCSA

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h localhost -u administrator -w %password%

  • Windows

“%VMWARE_CIS_HOME%”\vmdird\vdcrepadmin -f showservers -h localhost -u administrator -w %password%

  5. Show the replication partners of a particular PSC
  • VCSA

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartners -h localhost -u administrator -w %password%


  • Windows

“%VMWARE_CIS_HOME%”\vmdird\vdcrepadmin -f showpartners -h localhost -u administrator -w %password%

  6. Show replication partner status
  • VCSA

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator -w %password%


  • Windows

“%VMWARE_CIS_HOME%”\vmdird\vdcrepadmin -f showpartnerstatus -h localhost -u administrator -w %password%

  7. Create a PSC replication agreement
  • VCSA

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f createagreement -2 -h sourcepscfqdn -H destinationpscfqdn -u administrator -w %password%

  • Windows

“%VMWARE_CIS_HOME%”\vmdird\vdcrepadmin -f createagreement -2 -h sourcepscfqdn -H destinationpscfqdn -u Administrator -w %password%


  8. Remove a PSC replication agreement
  • VCSA

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f removeagreement -2 -h sourcepscfqdn -H destinationpscfqdn -u administrator -w %password%

  • Windows

“%VMWARE_CIS_HOME%”\vmdird\vdcrepadmin -f removeagreement -2 -h sourcepscfqdn -H destinationpscfqdn -u Administrator -w %password%

If you’re having communication/replication problems with the PSC (perhaps you have firewalls in your environment), then you can use the following tools to test port connectivity.

curl can be used in place of telnet with the following command [ref12]:

curl -v telnet://mypsc.domain.local:443
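If curl isn’t available, a few lines of Python do the same job; the PSC hostname in the comment is a placeholder:

```python
import socket

def port_open(host, port, timeout=3):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("mypsc.domain.local", 2012) to test the replication port
```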

You can install tcpdump and netcat on the VCSA using the following commands



You can use VDC Admin Tool to test LDAP connectivity, force replication plus more…

You can find the tool here






You can use JXplorer to browse LDAP using the following settings.



If you can’t WinSCP into the VCSA, you’ll need to change the root shell:

chsh -s /bin/bash root

You can change the shell back by running

chsh -s /bin/appliancesh root

If you want to remove a PSC from the environment because it has failed, you can use:

cmsso-util unregister --node-pnid PSCNAME.LOCAL.DOMAIN --username administrator@vsphere.local --passwd %password%

You can also use

vdcleavefed -h PSCNAME.LOCAL.DOMAIN -u administrator -w %password%

The password for the root account of the VCSA expires after 365 days by default. To set it to never expire:

chage -M -1 -E -1 root

To change the password at the CLI, type:


Then confirm your new password.

You can find PSC relevant logs in the following locations






List of references by link