One week ago NSX-T version 2.4.1 (Build 13716575) was released. Dozens of resolved issues are listed in the release notes. The process of upgrading a deployment is depicted in this post.
The first step is to download the 7.5 GB upgrade bundle file and upload it on the first screen of the NSX-T GUI’s Upgrade section:
After the upload is complete the bundle is extracted and its compatibility matrix is checked. Afterwards the upgrade process can be started:
The obligatory End User License Agreement has to be accepted as usual:
First step in the upgrade process is to upgrade the “Upgrade Coordinator” component:
When this step is completed three boxes with the current and new versions for the hosts, edges and management nodes are displayed:
It is recommended to run the pre-checks first. They verify that the environment is correctly configured for the further upgrade steps, e.g. whether the vSphere clusters are configured for DRS:
When the pre-checks have completed successfully you can proceed to the second step of the upgrade process, which is upgrading the hosts. All hosts known to NSX via Fabric/Nodes are displayed and grouped according to their clusters in vCenter. The order of the hosts in each group can be changed, as can the upgrade order (parallel or one after the other). The upgrade mode “Maintenance” is recommended for production environments: it evacuates each host (via vMotion) while placing it in maintenance mode before installing the new NSX VIBs. For test deployments the “In-place” upgrade mode can be selected, which might lead to service interruptions of the network functions offered by NSX to the running VMs.
The overall group upgrade order defines whether the host groups should be upgraded simultaneously:
During the upgrade the individual status of each group can be observed by clicking on it:
When all hosts are upgraded you can continue to the next step by clicking on “Next”:
All edge VMs have to be part of an edge cluster as those correspond to the edge groups, by which the edges are upgraded. During the upgrade the status reveals that a new operating system is installed on these:
When all edges are upgraded you can continue to the next step by clicking on “Next”:
Since the controller functionality was moved from the dedicated controller VMs to the manager with NSX-T 2.4 (and the manager was in turn changed from a single VM to a cluster), the fourth step is obsolete and can be skipped by clicking on “Next”:
The upgrade of the NSX-T manager cluster should be communicated to concerned parties (e.g. network admins) as functionality will not be available during the maintenance window:
The three manager VMs are upgraded in parallel:
By clicking on “More information” the detailed upgrade logs are displayed:
After completing the upgrade the manager VMs are rebooted. Until the services are available again this message is displayed:
With the management nodes being upgraded successfully the upgrade process is completed:
The upgrade history can be tracked by clicking on “Show Upgrade History”:
Two weeks ago the latest and greatest in VMware’s SDDC came out: VMware Cloud Foundation (VCF) version 3.7. Apart from including the current security patches (e.g. ESXi 6.7 EP 06 / build number 11675023) a couple of new automation features have been added, as you can see in the release notes. Also the cloud builders for setting up greenfield Cloud Foundation and VMware Validated Design deployments have been merged. As you can see in my tweet from a while back these used to be two separate OVA files.
VCF 3.7 can be installed as a new deployment or upgraded from the previous version (3.5.1), which is what I did to a dark site I am maintaining. The process is the same as in my previous posts:
After downloading the bundle files (around 21 GB) on a PC with internet access and importing those into the SDDC manager you can trigger the first phase of the update process, which is updating the SDDC manager itself:
This took less than 22 minutes on current Dell EMC hardware:
The new build numbers are 12695026 / 12695044 (UI).
After triggering the next update phase the vCenter and PSC instances are bumped from build number 10244745 to 11726888, which is the most current security update available:
The last step is upgrading the ESXi hosts to build number 11675023 which was released on 01/17/2019. Only recently (03/28/2019) a more current security patch was released, which will presumably be included in one of the future VCF upgrades.
Having all VCF 3.7 patches installed is confirmed by the displayed text “There is no update available”.
If you set up a VMware Cloud Foundation (VCF) deployment you will notice that all components (SDDC manager, vCenter, Platform Service Controllers, NSX manager & vRealize Log Insight) use self-signed SSL certificates for their web services. If you have a Microsoft Active Directory server or cluster you can use its Certificate Authority (CA) functionality to generate trusted certificates as described in the official documentation. However, there is an alternative if you are not willing to set up Microsoft servers or pay their license fees. You can create your own certificates with your internally trusted CA and let SDDC manager do the work of distributing them among the various VCF components.
In this example, based on the corresponding documentation page, I will use the free tool XCA, which is a graphical frontend to create and manage X.509 certificates. It is available for Windows, macOS and Linux.
When you have downloaded and installed the software and are opening it for the first time, you need to create a new database (see “File” menu) as a starting point. It will ask you for a filename and a password which you need to enter each time you are accessing the database. Also you should set the default hash algorithm to “SHA256” in the options menu, as “SHA 1” is deprecated.
In the simplest case you would create a CA by hitting “New Certificate”, selecting the “CA” template (followed by “Apply all”), giving it at least a name (Internal name, commonName) and generating a private key for it.
In my case however I already had a CA up and running elsewhere, which I used to create an intermediate CA called “xca”. To be able to use that to create certificates in the XCA tool I first had to import the already created private key:
Then I imported both the certificates of the root and intermediate CA:
If you are not using a self created CA as described above you need to select the externally created root CA and click on “Trust” in the context menu: (the intermediate CA is then trusted automatically)
Now it was time to generate the certificate signing requests using the SDDC manager interface. Select all resources whose certificates you want to replace and click on the “Generate CSR” button: (found under the “Security” tab of your workload/management domain)
This will let you download a tar.gz archive named like your workload domain. So for the management domain it is called “MGMT.tar.gz”. Extract that archive with your favorite tool, e.g. using “tar -xzf MGMT.tar.gz” for *nix. For Windows desktops 7-Zip is working fine, although you might need to extract in two steps (.tar.gz -> .tar -> extract contents).
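If you prefer a scripted, platform-independent extraction, the same step can be sketched with Python's standard tarfile module. This is only a minimal sketch: the function name and the target folder are my own choices, and the archive name is the one from the example above.

```python
import tarfile
from pathlib import Path


def extract_csr_archive(archive: Path, target_dir: Path) -> list[Path]:
    """Extract a .tar.gz archive and return all .csr files found in it."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Newer Python versions offer extraction filters; use the safe
        # "data" filter when it is available, fall back otherwise.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(target_dir, filter="data")
        else:
            tar.extractall(target_dir)
    return sorted(target_dir.rglob("*.csr"))


if __name__ == "__main__":
    archive = Path("MGMT.tar.gz")  # name taken from the example above
    if archive.is_file():
        for csr in extract_csr_archive(archive, Path(".")):
            print(csr)
```

The returned list gives a quick overview of which components actually delivered a CSR before you start importing them.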
After extraction you should have a folder also named like your workload domain with sub-directories named like the hostnames of your VCF components, containing a .csr file each. Import those in the “Certificate signing requests” tab in XCA using the “Import” button:
Pick a CSR, open the context menu and click on “Sign”:
The following window will appear. Make sure that the correct root or intermediate CA is selected for “Use this Certificate for signing”, and that a supported hash algorithm like “SHA 256” is selected: (ignore the Template selection)
In the next tab you can enter the time range for which the certificate will be valid. After entering a number you need to hit the “Apply” button. As all other important settings are already filled in from the CSR no further modifications are needed. It might be a good idea, though, to fill out the “X509v3 Subject Alternative Name” (SAN) field with the respective FQDNs and IP addresses (I will explain later on why).
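To avoid typos when filling the SAN field for many components, the value can be assembled by a small helper. This is a sketch under the assumption that the field accepts the usual OpenSSL-style notation ("DNS:..." and "IP:..." entries separated by commas); the host name and address below are made-up examples.

```python
import ipaddress


def build_san(fqdns: list[str], ips: list[str]) -> str:
    """Build an OpenSSL-style subjectAltName value from FQDNs and IPs."""
    entries = [f"DNS:{name}" for name in fqdns]
    for ip in ips:
        # Raises ValueError early if an address is malformed.
        ipaddress.ip_address(ip)
        entries.append(f"IP:{ip}")
    return ", ".join(entries)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(build_san(["vcenter.example.local"], ["10.0.0.10"]))
```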
After repeating the signing procedure for all CSRs the “Certificate” tab of XCA should look like the next screenshot. Here you need to export the created certificates to the same folders you imported the CSRs from, with the same filename (but with the file extension “.crt”). Also make sure the export format is set to “PEM”:
You also need to export a certificate chain of the trusted CAs to a file called “rootca.crt”, placed in the extracted directory where the other sub-directories are located. This can be done with XCA as shown below:
For the SDDC manager to be able to import the certificate structure (including the previously exported CSRs) the folder structure needs to be packed into a tar.gz archive once again. You might need to delete the old archive downloaded previously as the same name is used. In *nix use “tar -czf MGMT.tar.gz MGMT/”. Using 7-Zip it is again a two-step procedure. First add the folder to a tar archive like this:
Then add the tar archive to a gzip archive using the default settings:
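Before uploading, it can be worth checking that every CSR has a matching “.crt” file next to it and that “rootca.crt” sits in the top-level folder. The following sketch does that check and then packs the folder, roughly equivalent to the tar command mentioned above. The function names are my own, and the expected layout is simply the one described in this post, not an official specification.

```python
import tarfile
from pathlib import Path


def missing_files(domain_dir: Path) -> list[Path]:
    """Return the expected certificate files that do not exist yet."""
    missing = []
    chain = domain_dir / "rootca.crt"
    if not chain.is_file():
        missing.append(chain)
    for csr in sorted(domain_dir.rglob("*.csr")):
        crt = csr.with_suffix(".crt")  # same folder, same name, .crt
        if not crt.is_file():
            missing.append(crt)
    return missing


def pack_domain(domain_dir: Path) -> Path:
    """Create <domain>.tar.gz next to the folder, replacing an old archive."""
    archive = domain_dir.parent / f"{domain_dir.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(domain_dir, arcname=domain_dir.name)
    return archive


if __name__ == "__main__":
    domain = Path("MGMT")  # folder name from the example above
    if domain.is_dir():
        problems = missing_files(domain)
        if problems:
            print("Missing:", *problems, sep="\n  ")
        else:
            print("Created", pack_domain(domain))
```

Opening the archive in write mode overwrites an existing file of the same name, so the old download does not have to be deleted by hand in this variant.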
The resulting tar.gz file can then be uploaded in the menu opening after clicking on “Upload and install”:
If everything is done correctly the result should look like this:
All services except for the SDDC manager are restarted automatically, but you may need to close browser sessions if you still have old ones open or even clean your browsing cache. If you do not want to reboot your SDDC manager use SSH and the “vcf” user to log into it and run the following commands:
su
sh /opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh
Of course you still need to import your locally created CA into the trusted folder of your browser of choice so that it shows as “valid” HTTPS. This howto should help to accomplish this. In the end it should look like this:
One issue I found was that the connection between vCenter and NSX manager was no longer working with the new certificates. Searching the symptoms (vCenter displaying “No NSX Managers available. Verify current user has role assigned on NSX Manager.”) in the VMware knowledge base led me to check the lookup/registration page of the NSX manager appliance. It appears that Cloud Foundation sets up both URLs using their IP addresses. After changing both to the PSC/vCenter FQDNs (as shown in the screenshot below) and restarting the VMs everything was working again:
Another way to solve this could be to add the IP addresses of each VCF component to the individual “SAN” field when creating the certificates, as described above, so that the HTTPS connection is trusted both ways.
After just getting started with PowerCLI on my company Windows 10 notebook I read that it can also be run on Linux and macOS systems since last year. As I just started to like the functionality (which took some time when you are only accustomed to Bash and Python) I wanted to give that a try on my private MacBook Pro, so here are the steps I took:
First download the latest stable release for macOS (shown above), currently “powershell-6.1.2-osx-x64.pkg”, and install it. Then open a shell, either by clicking on “PowerShell” in the Launchpad or by opening a Terminal window and entering “pwsh”.
If you skip the first line the PSGallery repository, which hosts the PowerCLI packages, is not trusted, resulting in the following warning:
You are installing the modules from an untrusted repository. If you trust this repository, change its InstallationPolicy value by running the Set-PSRepository cmdlet. Are you
sure you want to install the modules from 'PSGallery'?
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help (default is "N"):
When a VMware Cloud Foundation deployment has been updated to the current version, as described previously, a few tasks should be done afterwards. First the vSAN datastore disk format version might need an upgrade. To check this head to the “Configure” tab of your DC in vCenter and click on “vSAN / Disk Management”:
Of course you should run the pre-check by clicking on the right button. If everything is working as it should it would look like this:
Now you can click the “Upgrade” button, which informs you that this can take a while. Also you should back up your data/VMs elsewhere, especially if you select “Allow Reduced Redundancy”, which speeds up the process:
As you can see now the disk format version has changed from “5” to “7”:
However, some vSAN issues are still displayed:
As this deployment is a “dark site”, meaning no internet access is available, the HCL database and Release catalog have to be updated manually.
The URL to download the 14.7 MB file can be found in a post from William Lam from 2015 or in this KB article. The release catalog’s URL is taken from another KB article. This file is less than 8 KB in size. After uploading both using the corresponding “Update from file” buttons the screen should look like this:
The last remaining issue in this case was that the firmware version of the host bus adapter connecting the vSAN datastore devices could not be retrieved (“N/A”):
Since the firmware version listed in the host’s iDRAC (see next screenshot) matches one of the “Recommended firmwares” from above I decided to simply hit “Silence alert”. Alternatively one could look for an updated VIB file allowing the ESXi host to retrieve the firmware version from the controller.
One more effect of the upgrade from 188.8.131.52 to 3.5 is the appearance of three more VMs in vCenter. These are the old (6.5.x) instances of the platform service controllers and the vCenter. New instances with version 6.7.x have been deployed during the upgrade. After all settings had been imported from the old ones, these were apparently powered off and kept in case something would have gone wrong. After a period of time and confirming everything works as expected those three VMs may be deleted from the datastore:
In this post I would like to show you the process of updating a VCF deployment at a customer site to the current version, which was released in mid-December. The pictures show only the update of the management workload domain, as that is the only one currently available there. If you have multiple VI/VDI workload domains you still have to update the management domain first, and then the individual workload domains.
The steps necessary are the same as in previous updates. E.g. if you are located at an environment isolated from the internet you can use a laptop to download the bundle files based on a delta file provided by the SDDC manager and import these afterwards, as described in one of my previous posts. The update itself can be scheduled or started immediately. The process is the same as before, but consists of multiple phases.
The first phase updates the VCF services themselves, including domain manager, SDDC manager UI and LCM:
Afterwards the NSX components are updated to version 6.4.4 as shown in the screenshot below:
In the next phase the platform service controllers (for the management domain typically two) and the vCenter are updated to version 6.7. Sadly in the first release of the VCF 3.5 update bundles there was a bug resulting in an error in the stage “vCenter upgrade create input spec”:
The SDDC manager’s log file “/var/log/vmware/vcf/lcm/lcm-debug.log” only showed a “java.lang.NullPointerException” error at the component “com.vmware.evo.sddc.lcm.orch.PrimitiveService”, which didn’t help me much, so after an unsuccessful Google search I contacted VMware’s support. Upon opening a support case on my.vmware.com a very friendly Senior Technical Support Engineer got back to me within minutes and pointed me to this knowledge base article. Apparently the issue cannot be fixed in place, but a new update bundle is available replacing the buggy one. If your SDDC manager has internet access it can download the bundle automatically, but if you are at a “dark site” you first need to get rid of the faulty bundle’s ID by running the following Python script:
Afterwards a new marker file has to be created and transferred to a workstation with internet access where the updated bundles are downloaded: (same procedure as described before)
This screen shows the new successful import of the previously downloaded bundles after copying them back to the SDDC manager:
Finally we can retry phase 3. As you can see here a new screen appears now:
As vCenter appliances cannot be upgraded from 6.5 to 6.7 directly, a new appliance has to be deployed which then imports all settings from the old one. To be able to complete this process the SDDC manager needs a temporary IP address for that new appliance in the same range as the vCenter/PSC:
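Whether a candidate temporary address really lies in the same range as the existing appliances can be verified quickly with Python's ipaddress module. This is just a sketch: the addresses and the /24 prefix length are made-up examples, so substitute the values of your management network.

```python
import ipaddress


def in_same_network(candidate: str, existing_ip: str, prefix_len: int) -> bool:
    """Check whether candidate lies in the subnet of existing_ip/prefix_len."""
    network = ipaddress.ip_interface(f"{existing_ip}/{prefix_len}").network
    return ipaddress.ip_address(candidate) in network


if __name__ == "__main__":
    # Hypothetical addresses for illustration only.
    print(in_same_network("10.0.0.50", "10.0.0.10", 24))
    print(in_same_network("10.0.1.50", "10.0.0.10", 24))
```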
Check the review screen to confirm the temporary IP settings and hit “Finish” to start the update:
Hooray! The update process did not fail at the stage it did before:
After a little more than an hour all three appliances are up-to-date:
As we can see in the overview screen of the domain all components are updated, except for the ESXi hosts:
This means the fourth and final phase can be started: the update of the ESXi hosts to build 10764712:
This concludes the update to VCF 3.5 as all components now have the current build numbers:
The next screenshot of the Update history section shows the update from 3.0.1 to 184.108.40.206 and the four updates from above:
After deploying vROPS using the vRSLCM yesterday, today the task was to deploy two separate instances of vRealize Log Insight. Both instances should consist of a cluster of one master and three workers (deployment type “Medium with HA”) and be placed on different hypervisor clusters, each managed by their own vCenter and separated by a third-party firewall. Finally the “outer” vRLI cluster would forward their received telemetry onto the “inner” cluster, which will function as part of a central SIEM platform.
The first step is to deploy both of the clusters. Again the “Create Environment” screen is used:
After all deployment parameters have been entered the pre-check is performed, but it failed. Allegedly the IP addresses provided could not be resolved. Since correctly configured Active Directory servers with the corresponding A and (reverse) PTR records were set up and reachable, the warnings were ignored:
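To double-check the DNS records independently of the LCM pre-check, a forward and reverse lookup can be scripted with Python's socket module. This is a minimal sketch that uses the resolver of the machine it runs on, which may differ from what the LCM appliance sees. The function name and the example host are assumptions of mine.

```python
import socket


def check_dns(fqdn, expected_ip,
              forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of problems found for the A and PTR records."""
    problems = []
    try:
        resolved = forward(fqdn)
        if resolved != expected_ip:
            problems.append(f"A record: {fqdn} -> {resolved}, expected {expected_ip}")
    except OSError as exc:  # covers socket.gaierror and socket.herror
        problems.append(f"A record lookup failed for {fqdn}: {exc}")
    try:
        name = reverse(expected_ip)[0]
        if name.lower().rstrip(".") != fqdn.lower().rstrip("."):
            problems.append(f"PTR record: {expected_ip} -> {name}, expected {fqdn}")
    except OSError as exc:
        problems.append(f"PTR record lookup failed for {expected_ip}: {exc}")
    return problems


if __name__ == "__main__":
    # Hypothetical node; replace with the FQDNs and IPs of your vRLI nodes.
    for problem in check_dns("vrli-master.example.local", "10.0.0.30"):
        print(problem)
```

The two lookup functions are passed in as parameters so the check can also be exercised without a real DNS server.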
The environment creation is initiated:
After deploying the master the three workers are deployed in parallel:
After deploying the three workers the LCM fails to configure the supplied NTP servers for some reason:
At this point you have two options. The first one is to delete the environment (including the VMs, via the checkbox below) and start over: (e.g. if you actually made a mistake)
The other option is to resume the request: (The arrow on the right already disappeared after clicking so I drew one where it was)
This time the step and eventually the entire request finished successfully. From the vCenter perspective the result will look like this:
This process is repeated for the second cluster / environment, leaving us with two environments, each with a vRealize Log Insight cluster:
The next step is to set up message forwarding, so that the “inner” cluster also receives the messages from the devices logging to the “outer” cluster, while the firewall between the clusters only allows SSL-secured traffic from the “outer” cluster to the “inner” one. Before configuring the two vRLI clusters we first need to export the certificate for the “inner” cluster, which was created separately using the vRSLCM: (If the same certificate is used for both environments, e.g. subject alternative name=*.”parent.domain”, you can skip this)
The receiving (“inner”) cluster can be configured to accept only SSL encrypted traffic: (optionally)
Finally the FQDN for the virtual IP of the “inner” cluster is added as event forwarding destination in the configuration page of the “outer” cluster. The protocol drop-down should be left on “Ingestion API” as changing to “Syslog” will overwrite the original source IPs of the logging entries. After checking the “Use SSL” box verify the connection by using the “Test” button:
If no filters are added here all events received by that vRLI cluster will also be available on the other one.
For testing the setup I configured an NSX-T manager, placed in the “inner” management cluster, to log directly to the “inner” cluster, and a couple of edge VMs, which were deployed to the “outer” edge cluster, as described here.
In my previous post I described how to deploy the vRealize Lifecycle Manager 2.0 and import product binaries and patches. Now it is time to make use of it to deploy the first vRealize product: vRealize Operations Manager. There are some more steps, which you need to complete first, like generating a certificate or certificate signing request, and also some optional tasks, like adding an identity manager or Active Directory association. As they are described quite well in the official documentation I will skip those here.
Before you can add an environment (the term used for deploying vRealize products) a vCenter has to be added. The documentation states how to add a user with only the necessary roles, but for testing purposes you can also use the default administrator SSO account.
If you have an isolated environment the request to add a vCenter will look like the above screenshot, as it can’t get patches from the internet, but it will still work. In the “Create Environment” screen you can select which products you want to deploy. For each product you need to select the version and the deployment type:
Next to the deployment type each product has a small “info” icon. Upon clicking it the details of each type are displayed:
After selecting your desired products you have to accept the license agreements and fill in details like license keys, deployment options, IP addresses, host names etc.
After putting in all necessary information a pre-check is performed:
The pre-check verifies the availability of your DNS servers, datastores and so on:
After submitting the LCM creates the environment according to your input:
As I made a mistake in the DNS server configuration the request failed.
Upon clicking “View Request Details” a more detailed view is presented. (see screenshot below) Before deleting the environment and giving it another shot after having the mistake fixed you should export the configuration. Two options are offered: Simple or Advanced. I picked simple, which lets you download most of the parameters you entered as a JSON file.
The red info icon in the lower left corner gives even more details. In my case the successfully deployed master node was not reachable because of the DNS misconfiguration mentioned above.
In the “Create Environment” screen you can paste the contents of the saved JSON file (see above) to speed up the process. This brings you directly to the pre-check step. However you still need to go back one step and select your NTP servers – this doesn’t seem to be included in the JSON configuration. While the environment creation request is in progress you can also see details:
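To see up front whether an exported configuration carries any NTP setting at all, the JSON file can be searched for matching key names. The sketch below walks the structure recursively. I do not know the exact schema of the export, so the key names in the sample are purely hypothetical.

```python
import json


def find_keys(node, needle, path=""):
    """Return the paths of all keys whose name contains needle."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if needle.lower() in key.lower():
                hits.append(child)
            hits.extend(find_keys(value, needle, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_keys(item, needle, f"{path}[{index}]"))
    return hits


if __name__ == "__main__":
    # Hypothetical excerpt; a real export will look different.
    sample = json.loads('{"infrastructure": {"dnsServers": "10.0.0.1"}}')
    print(find_keys(sample, "ntp") or "no NTP-related keys found")
```

For a real export you would load the saved file with json.load and run the same search on the result.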
Finally the request finished successfully. Some steps were left out, probably because this is a single node deployment and not a “real” cluster…
After the environment is created you can (and should) enable health checks via the menu which opens when you click the three dots in the upper right corner of the request box. This menu also lets you download logs and export the configuration, as done before.
The first task I am going to do with the newly deployed vROPS is to install the HF3 security fix imported earlier:
Just select the patch, click “Next” to review and install:
You can monitor the patch installation progress:
To be able to use the integrated Content Management you have to configure the environment as an endpoint. Just click the link “Edit” which appears when clicking on the three dots next to the list element:
First confirm or modify the credentials entered earlier and test the connection:
Finally you have four checkboxes to select your desired Policy Settings:
I will pick up the Content Management section in another blog post. Up until then the vROPS deployed using the vRealize Suite LCM can be used as usual by opening the web GUI. It asks you to set your currency (can’t be modified later on!) and is ready to fill its dashboards with data as soon as you configure the parameters and credentials for the solutions you want to monitor, e.g. vCenter: