Welcome back to part two of our deep dive into upgrading VMware Cloud Foundation. In our previous session, we kicked off the process, and today we are focusing on a critical milestone: upgrading the SDDC Manager to version 9.1 and deploying the new components.
This isn’t just a simple version bump. VMware is shifting many components from traditional virtual machines to a container-based Kubernetes architecture, which introduces some new configuration steps—and a few potential pitfalls.
Here is a step-by-step breakdown for handling the SDDC Manager upgrade, downloading the new binaries, and navigating the architectural changes.
Phase 1: Pre-Checks and the Initial Upgrade
To start, you will want to log into the VCF Operations Portal. While you can initiate the process directly from the SDDC Manager Lifecycle Management tab, I recommend sticking to the Operations Portal to ensure consistency.
- Navigate to Lifecycle Management: Select the update for version 9.1.
- Run the Pre-checks: Never skip this step. The pre-check will scan your environment to ensure there are no pending backups, stuck processes, or configuration issues that could halt the upgrade.
- Execute the Update: Once the pre-check comes back clean, start the upgrade.
After a few minutes, the Operations Portal, License Server, and SDDC Manager will show as successfully updated to 9.1. But the work doesn’t stop here.
Phase 2: Downloading the New Binaries
Because VCF 9.1 introduces and modifies several architectural components, your next stop is the Binary Downloads section. You need to pull down the binaries for the new containerized ecosystem.
Make sure you download the packages for:
- Identity Broker
- Telemetry & License Server
- Salt Master & RaaS
- SDDC LifeCycle
- Cloud Proxy
- Fleet Lifecycle & Software Depot
Important Note: Do not forget to download the Service Runtime binaries. If you miss these (like I initially did in the lab!), the deployment will fail later on.
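If you want a quick sanity check before moving on, here is a minimal sketch in Python that compares what you have ticked off against the list above. The component names come from this post; the helper name missing_binaries and the example input are my own illustration, not anything VCF provides.

```python
# Component names are taken from the download list in this post.
# The helper and the example input below are illustrative only.
REQUIRED_BINARIES = {
    "Identity Broker",
    "Telemetry & License Server",
    "Salt Master & RaaS",
    "SDDC LifeCycle",
    "Cloud Proxy",
    "Fleet Lifecycle & Software Depot",
    "Service Runtime",
}


def missing_binaries(downloaded):
    """Return the required packages not present in `downloaded`, sorted by name."""
    return sorted(REQUIRED_BINARIES - set(downloaded))


# Hypothetical example: everything ticked off except Service Runtime.
ticked_off = REQUIRED_BINARIES - {"Service Runtime"}
print(missing_binaries(ticked_off))  # ['Service Runtime']
```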
Phase 3: Component Configuration & The CIDR “Gotcha”
Once the binaries are downloaded, it is time to configure the pod and install the components. The system will prompt you for the Operations FQDN, your cloud operations password (which will require you to accept the thumbprint), and the FQDNs for the new services, such as the Fleet and Identity Broker.
Then comes the tricky part: The VCF Service CIDR.
Because VCF is moving these services to Kubernetes, it requires IP space for the pods. During my upgrade, I hit a roadblock that isn’t clearly explained in the UI. This will definitely be improved in the next patch.
Troubleshooting the Subnet Configuration
Initially, I attempted to assign an entirely new /24 subnet (e.g., 192.168.240.0/24) for the runtime services. The deployment threw an error:
“Not all addresses from the VCF service IP pools are part of the management network.”
It also threw a reverse DNS lookup failure. Here is how to fix both:
- Fix Your DNS: Ensure your A records and reverse lookups are correctly populated for all the new FQDNs you just defined.
- The CIDR Solution: In a greenfield deployment, VCF asks for an IP range. During an upgrade, however, it asks for a CIDR block. Do not create a new subnet. Instead, carve out a block of IPs from your existing management subnet. In my case, I used my 192.168.200.x management network and provided a /27 CIDR, giving the Kubernetes cluster a block of 32 addresses on the existing management network.
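To see why the first attempt was rejected, you can reproduce the containment check offline with Python's standard ipaddress module. This is only a sketch of the rule described by the error message, not the validation VCF actually runs. The management network 192.168.200.0/24 and the block 192.168.200.224/27 are assumed values for illustration; substitute your own.

```python
import ipaddress


def check_service_cidr(service_cidr, management_cidr):
    """Return (is_inside_management_network, number_of_addresses_in_block)."""
    # strict=True rejects a CIDR whose host bits are set, e.g. 192.168.200.5/27.
    service = ipaddress.ip_network(service_cidr, strict=True)
    management = ipaddress.ip_network(management_cidr, strict=True)
    return service.subnet_of(management), service.num_addresses


# Assumed management network for this example: 192.168.200.0/24.
# A brand-new subnet, as in my first attempt, is not part of it.
print(check_service_cidr("192.168.240.0/24", "192.168.200.0/24"))    # (False, 256)

# A hypothetical /27 carved out of the management network passes.
print(check_service_cidr("192.168.200.224/27", "192.168.200.0/24"))  # (True, 32)
```

Whichever block you pick, make sure those addresses are not already in use or handed out by DHCP on the management network.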
Once I updated the DNS entries and corrected the CIDR block to live on the management network, the pre-checks passed, and the installation proceeded.
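If you would rather catch the DNS problem before the installer does, a forward and reverse lookup check is easy to script. The sketch below uses only Python's standard socket module. The function name check_dns is my own, and the resolver functions are passed in as parameters so the logic can be exercised without touching a real DNS server.

```python
import socket


def check_dns(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return (ok, detail) for a forward lookup followed by a reverse lookup."""
    try:
        ip = forward(fqdn)  # A record: name -> IPv4 address
    except OSError as exc:
        return False, f"no A record for {fqdn}: {exc}"
    try:
        ptr_name = reverse(ip)[0]  # PTR record: address -> primary host name
    except OSError as exc:
        return False, f"no PTR record for {ip}: {exc}"
    # Compare case-insensitively and ignore a trailing dot.
    if ptr_name.rstrip(".").lower() != fqdn.rstrip(".").lower():
        return False, f"{ip} resolves back to {ptr_name}, not {fqdn}"
    return True, f"{fqdn} <-> {ip}"
```

Called with the defaults, for example check_dns("fleet.example.internal") (a placeholder name, not one from my lab), it queries whatever resolver your workstation is configured to use, so run it from a machine that points at the same DNS servers as your management network.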
Phase 4: The Bootstrap and Container Deployment
With the networking sorted, the real upgrade takes over.
- The Bootstrap VM: VCF will first deploy a temporary template called the Bootstrap VM.
- Containerization: This Bootstrap VM acts as the orchestrator, spinning up the Kubernetes cluster and starting the containers for all the services we downloaded earlier (Runtime, Fleet Lifecycle, Salt Master, Software Depot).
- Clean Up: Once the container-based architecture is successfully running, the system automatically deletes the temporary Bootstrap VM. You will then see new virtual machines deployed, such as the vcf9-runtime nodes, which will handle these services moving forward.
Expect this automated deployment phase to take anywhere from an hour to an hour and a half.
What’s Next?
We now have the SDDC Manager fully upgraded to 9.1, running on the new containerized architecture!
In Part 3 of this series, we will tackle the VCF Automation upgrade. Following that, in Part 4, we will finally upgrade the management cluster itself—bringing our NSX, vCenter, and ESXi hosts from 9.0.2 up to 9.1.
Stay tuned, and I will see you in the next video!