Fixing expired NSX self-signed certificates after upgrading from NSX-T 3.2.x to NSX 4.x (VCF 5.1)

After upgrading a customer's VCF 4.5 management domain to 5.1, which took NSX to v4.1.2.1.0 in the process, we saw that NSX suddenly listed a lot of expired certificates that had not been flagged before. They had in fact already expired; NSX 3.x just does not show them as expired and raises no alarms for them.

While troubleshooting this, I came across VMware KB94898, which helped me fix it. It states:

NSX Managers have many certificates for internal services. In version NSX 3.2.1, Cluster Boot Manager (CBM) service certificates were incorrectly given a validity period of 825 days instead of 100 years.
This was corrected to 100 years in NSX 3.2.3.
However any environment originally installed on NSX 3.2.1 will have the internal CBM Corfu certs expire after 825 regardless of upgrade to the fixed version or not.

On NSX-T 3.2.x internal server certificates could expire and no alarm would trigger. There was no functional impact.
Starting from NSX 4.1.0.2, NSX alarms now monitor validity of internal certificates and will trigger for expired or soon to expire certificates.
Note on NSX 4.1.x, there is currently no functional impact when an internal certificate expires however alarms will continue to trigger.

– VMware KB94898
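To get a feel for when an affected environment runs into this, the 825-day validity from the KB can be turned into a date. The sketch below is only arithmetic on the number quoted above; the function names and the example issue date are my own placeholders, not something taken from NSX or the KB.

```python
from datetime import date, timedelta

# Validity that KB94898 gives for the affected CBM certificates.
CBM_VALIDITY_DAYS = 825


def cbm_cert_expiry(issued: date, validity_days: int = CBM_VALIDITY_DAYS) -> date:
    """Return the date on which a certificate issued on `issued` expires."""
    return issued + timedelta(days=validity_days)


def days_until_expiry(issued: date, today: date) -> int:
    """Positive: days left. Negative: days since the certificate expired."""
    return (cbm_cert_expiry(issued) - today).days


# Hypothetical install date, for illustration only.
example_issued = date(2022, 1, 1)
print("expires on:", cbm_cert_expiry(example_issued))
print("days left on 2024-04-06:", days_until_expiry(example_issued, date(2024, 4, 6)))
```

In other words, an appliance installed on 3.2.1 hits the expiry a little over two years and three months after deployment, which is why it tends to surface around the time of a later upgrade.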

The KB provides a script to rectify this. It replaces all certificates in NSX except the API and cluster certificates, and gives each new certificate a validity of 99 years. I ran the script from a Linux client (WSL on Windows should work as well), and it cleared all 18 of our alarms (6 certificates on each of the three nodes). Please note that it restarts the manager nodes one by one, and you should not make any changes to NSX while it runs. Taking a backup first is of course also recommended.
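If you want to see which certificates are expired before (and after) running the script, without clicking through the UI, the certificate list can be filtered with a few lines of Python. This is a rough sketch and not part of the KB script. It assumes the list comes from the NSX Manager API (`GET /api/v1/trust-management/certificates`) and that each entry carries a `details` array whose first element has a `not_after` timestamp in epoch milliseconds; verify both against the API reference for your NSX version. The sample payload is made up.

```python
from datetime import datetime, timezone


def to_epoch_ms(moment: datetime) -> int:
    """Helper for building the sample data below."""
    return int(moment.timestamp() * 1000)


def expired_certificates(payload: dict, now: datetime) -> list:
    """Return (display_name, not_after) for each certificate expired before `now`.

    Assumption: `payload` is the parsed JSON of the certificate list, and
    details[0] describes the leaf certificate with `not_after` in epoch ms.
    """
    expired = []
    for cert in payload.get("results", []):
        details = cert.get("details") or []
        if not details or "not_after" not in details[0]:
            continue  # no parsed X.509 details available, skip this entry
        not_after = datetime.fromtimestamp(details[0]["not_after"] / 1000, tz=timezone.utc)
        if not_after < now:
            expired.append((cert.get("display_name", "<unnamed>"), not_after))
    return expired


# Made-up sample payload, not real output from an NSX Manager.
sample = {
    "results": [
        {"display_name": "example-expired",
         "details": [{"not_after": to_epoch_ms(datetime(2024, 1, 1, tzinfo=timezone.utc))}]},
        {"display_name": "example-valid",
         "details": [{"not_after": to_epoch_ms(datetime(2100, 1, 1, tzinfo=timezone.utc))}]},
    ]
}

for name, not_after in expired_certificates(sample, datetime(2024, 6, 1, tzinfo=timezone.utc)):
    print(f"{name} expired on {not_after:%Y-%m-%d}")
```

Running the same check after the replacement should return an empty list for the internal service certificates.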

Full output (shown here with a fake IP):

user@linux:~$ python3 replace_certs.py


*************************************************************
* This script will replace all certificates (except API and *
* cluster certificates) in an NSX unified appliance cluster *
* with certificates of validities of 99 years. New certifi- *
* cates will retain all properties of the old certificates.  *
*                                                           *
* Estimated Execution time is 30 minutes. During this time  *
* period do not make any configuration or update to the     *
* system.                                                   *
*                                                           *
* It is highly recommended to backup the system before      *
* certificate replacements.                                 *
*************************************************************

Do you want to continue (Yes/No)? yes
Have you backed up your system? (Yes/No) yes
Enter the virtual IP address (VIP) of the cluster: 10.333.333.30
Enter the cluster's admin password (will not be displayed):
To confirm, re-enter the password (will not be displayed):

You are about to replace the certificates in cluster 10.333.333.30
Do you want to continue: (Yes/No)yes
Replacing APH certificate in node 91432a42-...
Replacing CBM_MESSAGING_MANAGER certificate in node 91432a42-...
Replacing CBM_UPGRADE_COORDINATOR certificate in node 91432a42-...
Replacing CBM_CORFU certificate in node 91432a42-...
Replacing CBM_MP certificate in node 91432a42-...
Replacing CCP certificate in node 91432a42-...
Replacing APH_TN certificate in node 91432a42-...
Replacing CBM_SITE_MANAGER certificate in node 91432a42-...
Replacing CBM_CCP certificate in node 91432a42-...
Replacing CBM_MONITORING certificate in node 91432a42-...
Replacing CBM_IDPS_REPORTING certificate in node 91432a42-...
Replacing CBM_CM_INVENTORY certificate in node 91432a42-...
Replacing CBM_AR certificate in node 91432a42-...
Replacing CBM_CLUSTER_MANAGER certificate in node 91432a42-...
Replacing APH certificate in node 9e8b2a42-...
Replacing CBM_MESSAGING_MANAGER certificate in node 9e8b2a42-...
Replacing CBM_UPGRADE_COORDINATOR certificate in node 9e8b2a42-...
Replacing CBM_CORFU certificate in node 9e8b2a42-...
Replacing CBM_MP certificate in node 9e8b2a42-...
Replacing CCP certificate in node 9e8b2a42-...
Replacing APH_TN certificate in node 9e8b2a42-...
Replacing CBM_SITE_MANAGER certificate in node 9e8b2a42-...
Replacing CBM_CCP certificate in node 9e8b2a42-...
Replacing CBM_MONITORING certificate in node 9e8b2a42-...
Replacing CBM_IDPS_REPORTING certificate in node 9e8b2a42-...
Replacing CBM_CM_INVENTORY certificate in node 9e8b2a42-...
Replacing CBM_AR certificate in node 9e8b2a42-...
Replacing CBM_CLUSTER_MANAGER certificate in node 9e8b2a42-...
Replacing APH certificate in node ecf72a42-...
Replacing CBM_MESSAGING_MANAGER certificate in node ecf72a42-...
Replacing CBM_UPGRADE_COORDINATOR certificate in node ecf72a42-...
Replacing CBM_CORFU certificate in node ecf72a42-...
Replacing CBM_MP certificate in node ecf72a42-...
Replacing CCP certificate in node ecf72a42-...
Replacing APH_TN certificate in node ecf72a42-...
Replacing CBM_SITE_MANAGER certificate in node ecf72a42-...
Replacing CBM_CCP certificate in node ecf72a42-...
Replacing CBM_MONITORING certificate in node ecf72a42-...
Replacing CBM_IDPS_REPORTING certificate in node ecf72a42-...
Replacing CBM_CM_INVENTORY certificate in node ecf72a42-...
Replacing CBM_AR certificate in node ecf72a42-...
Replacing CBM_CLUSTER_MANAGER certificate in node ecf72a42-...

The script also deleted the old, expired certificates, and all alarms for this problem were resolved automatically.
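With this much output it is easy to miss a node, so a quick tally can confirm that every manager got the same set of certificates replaced. The snippet below is a small sketch of my own that parses the "Replacing ... certificate in node ..." lines shown above; it assumes you saved or pasted the script output as text, and it only counts lines, it does not talk to NSX.

```python
import re
from collections import defaultdict

# Matches lines such as: "Replacing CBM_CORFU certificate in node 91432a42-..."
LINE_RE = re.compile(r"^Replacing (\S+) certificate in node (\S+)$")


def tally_replacements(output: str) -> dict:
    """Map each node ID to the list of service certificates replaced on it."""
    per_node = defaultdict(list)
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            service, node = match.groups()
            per_node[node].append(service)
    return dict(per_node)


# Shortened excerpt of the output above, for illustration.
excerpt = """\
Replacing APH certificate in node 91432a42-...
Replacing CBM_CORFU certificate in node 91432a42-...
Replacing APH certificate in node 9e8b2a42-...
Replacing CBM_CORFU certificate in node 9e8b2a42-...
"""

for node, services in tally_replacements(excerpt).items():
    print(f"{node}: {len(services)} certificates replaced ({', '.join(services)})")
```

Fed with the full output from our run, this reports the same 14 service certificates for each of the three nodes.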
