VMware SSO 5.5 in HA mode

As everyone knows, VMware introduced the SSO Service in vSphere 5.1. This service is designed to handle all authentication requests coming from a growing number of VMware services. Unfortunately, in 5.1, SSO was no great pleasure to administer.

To solve many of the problems of SSO 5.1, VMware released version 5.5 of the vSphere Suite in September 2013. This version really did solve many SSO problems and also introduced some new features: Multi-Master mode and Site management, to list just 2 of them.

Involved in a big vCloud deployment, we decided to deploy version 5.5 of SSO directly, because upgrading from SSO 5.1 to SSO 5.5 is not a real pleasure (good advice: do not start this upgrade 1h before an important meeting or before going home :)).

Anyway, every other component will be kept in version 5.1. We made this choice because version 5.5 has not been well tested (at the time of writing this article).
We have also chosen to deploy SSO 5.5 behind a load balancer. And, to be clear, this is no holiday: no docs (for the moment), only 1 KB article that explains it with some vagueness and imprecision, and no real articles on how to deploy it in a supported way. The last thing to note is that we decided to deploy 1 “SSO Domain” with multiple sites.

To be clear, even though this article was written using a mixed 5.1 and 5.5 environment, the procedure should also work in a 5.5-only environment.

 

Requirements

The components involved here are:

  • 1x vCenter Server 5.1, also called Management vCenter
  • 1x ESXi 5.1
  • 1x vShield Manager 5.1
  • vShield Edge, HA if possible but not required
  • At least 2x Windows Servers to support VMware SSO 5.5

 

Prepare the environment

We will be using 1x Windows Server to deploy the Management vCenter. The Management vCenter must contain all components, but in different versions, i.e.:

  • SSO 5.5
  • Inventory Services 5.1
  • vCenter Server 5.1
  • vSphere Web Client 5.5 (required with SSO 5.5 and to be supported by VMware)

First, we will deploy SSO 5.5 on the Windows Server. This is a standard deployment. 2 things to keep in mind while deploying:

  • Use Option 1 to deploy it, that is: deploy the SSO service in a new environment with no existing site
  • We will call this site “Site-A”

Next, Inventory Service 5.1 and vCenter Server 5.1 will be deployed; there is no specific option to note.

Finally, vSphere Web Client 5.5 can be deployed (in fact, it may be deployed before Inventory Service and vCenter Server, it doesn’t matter ;)).

At this point, the ESXi 5.1 host can be registered with the vCenter Server.

Next, vShield Manager 5.1 can be deployed on this vCenter. Register it with the SSO Service and the vCenter. The next step will be to deploy a vShield Edge (in HA if it matters to you). At this point, the network environment should be prepared like this:

  • VLAN 1 : Will be the public VLAN. It will host the vIP used to address the SSO Service
  • VLAN 2 : Will be the Private VLAN. It will host the SSO Server

 

Deploy vShield Edge

So, the second step is to deploy the vShield Edge. Using HA mode for this Edge is a best practice for availability but is not required. I will not detail how to deploy a vShield Edge instance (maybe I will write an article about deploying an Edge device in the future).

Regarding the network configuration, the “One Arm” setting is not supported by VMware. So, at least 2 interfaces must be configured: 1 as an Uplink and 1 as an Internal. The Uplink interface will be connected to the Public VLAN (1). The Internal interface will be connected to the Private VLAN (2).

The Edge is required this early because it is the only gateway that will exist between VLAN 2 and VLAN 1.

At this step, the default gateway of VLAN 1, which connects it to the other networks of the corporation, should be configured with a static route to VLAN 2 via the Edge Uplink interface (on VLAN 1).

The final point is to open ports to allow communication between the existing SSO Service (on the Management vCenter) and the new SSO Servers. I strongly suggest opening all ports on the Edge (i.e. set the default policy to “Allow”), but you may choose to open only the required ports (7444 at least, and possibly the ones Windows requires: 389, 53, …). You can change the default policy of the Edge afterwards.

 

Install VMware SSO Servers

I will not detail how to install the SSO Service using the “Existing domain” method. You will find this everywhere with a quick Google search.

But there are a few things to remember while deploying:

  • You may integrate with the SSO Service of the Management vCenter, but as a new site (a new site is obviously mandatory). You may also choose not to integrate with the SSO Service of the Management vCenter. I prefer integration, as Multi-Master mode greatly simplifies day-to-day operations.
  • The first SSO node will be deployed using the third option “SSO as a new site in an existing deployment”. We will call it “Site-B”
  • The second SSO node will be deployed using the second option “SSO node in an existing site” (Site-B if you’re lost :))
  • If the Windows servers are hosted in a DRS cluster, a best practice is to create an “Anti-Affinity rule” to prevent these 2 VMs from running on the same host

Regarding the network, they will be connected to VLAN 2, the Private VLAN. Their default gateway will be the Internal interface of the vShield Edge.

 

Prepare Edge device for Load Balancing

We will then deploy the Load Balancing Service on the Edge device. To proceed, just follow these steps:

  1. On the Edge device, go to the Load Balancer tab
  2. Click on the “Green + Sign” and name the pool
    (called SSO-Pool in the next steps)
  3. Activate the HTTPS service, with “Balancing Method” set to LEAST_CONN and port 7444
  4. On the “Health Check” window, activate HTTPS on port 7444 in TCP mode. You may leave the other parameters at their default values
  5. On the next window, “Add member”, add the IP addresses of your 2 nodes (not their names), with port 7444 for both the “Port” column and the “Monitor Port” column (default values)
  6. Review settings and Finish

Next, we have to enable a virtual server interacting with this pool. Proceed with the following steps :

  1. On the Edge device, go to the Load Balancer tab
  2. Click on the “Green + Sign”, click on the “Virtual Servers” link
  3. Name the virtual server. This is the network name of the Load Balancer (use the short name, not the FQDN). We will use the name “vip-sso”
  4. Enter a short description if you want (always a best practice :))
  5. Enter the IP address of the Load Balancer
  6. Choose the SSO Pool
  7. Tick the HTTPS line, choose port 7444 and “Persistence Method” SSL_SESSION_ID
  8. Tick “Enabled” and Save
  9. When the window is closed, go back to Pools, using the link near “Virtual Servers”, and click “Enable” near “Load Balancer Service Status” (if it is already enabled, you have to disable and re-enable it)

You can test your configuration by :

  • Checking the “Service Status” in the Load balancer tab of the Edge device (should be “Up”)
  • Using your web browser and going to URL : https://vip-sso.fqdn.com:7444/lookupservice/sdk

At this point, we have an Edge device configured with Load Balancing.

Configure SSO Certificates

At this point, it is required to deploy certificates. Why? When we reconfigure the SSO Service, we will be required to provide the “.pem” file. Another point is that, once the SSO Service is reconfigured, it will look for a certificate whose Common Name and Subject Alt Names cover all of these names:

  • The vIP FQDN (Prefer this one for the Common Name – Not required but it is more “readable”)
  • The vIP Short Name
  • The SSO Server 1 FQDN
  • The SSO Server 1 Short Name
  • The SSO Server 2 FQDN
  • The SSO Server 2 Short Name

All of these are required because, when “clustered”, the SSO Servers must:

  • Speak as the vIP
  • Share the same certificate, so that all registered services recognize the SSO Service
  • Know their own name

The SSL Automation Tool may be used to update these certificates.
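
The list of names the certificate has to cover can be derived mechanically from the vIP and node FQDNs. Here is a minimal sketch in Python, assuming that each short name is simply the first label of the FQDN, and that you build the request with an OpenSSL-style configuration file (the article itself does not prescribe a tool for generating the request). The host names sso1.fqdn.com and sso2.fqdn.com are hypothetical examples.

```python
def san_entries(vip_fqdn, node_fqdns):
    """Return the names the shared SSO certificate must cover:
    FQDN then short name, first for the vIP, then for each node."""
    names = []
    for fqdn in [vip_fqdn, *node_fqdns]:
        short = fqdn.split(".", 1)[0]  # assumption: short name = first DNS label
        for name in (fqdn, short):
            if name not in names:      # skip duplicates (FQDN without a domain part)
                names.append(name)
    return names


def openssl_san_line(names):
    """Format the names as a subjectAltName value for an OpenSSL config file."""
    return "subjectAltName = " + ", ".join("DNS:" + n for n in names)


if __name__ == "__main__":
    # Hypothetical host names, to be replaced with your own.
    names = san_entries("vip-sso.fqdn.com", ["sso1.fqdn.com", "sso2.fqdn.com"])
    print(openssl_san_line(names))
```

The first entry returned is the vIP FQDN, which matches the suggestion above to use it as the Common Name.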

At this point, we have:

  • vShield Edge used to route the traffic from the Public VLAN (1) to the Private VLAN (2) and configured to load balance the SSO Service on port 7444
  • 2x SSO Services integrated, with updated certificates with all CN & SAN configured

 

Alter SSO Service configuration

This is the final point!! But it is also really the hardest. It is based on VMware KB article 2058838.

Edit the Catalina configuration files

This modification must be done on all SSO nodes :

  1. Open Windows Explorer and go to the directory “C:\ProgramData\VMware\cis\runtime\VMwareSTS\conf\” (default directory)
    Please note that this directory is hidden (in the default configuration of Windows Explorer), so you have to manually type “C:\ProgramData” in Windows Explorer to enter it
  2. Open the “server.xml” file
  3. On the line <Engine defaultHost="localhost" name="Catalina">, add jvmRoute="nodeX" just before the closing >, so that the line reads <Engine defaultHost="localhost" name="Catalina" jvmRoute="nodeX">, where X is 1 for the first edited node, 2 for the second edited node, and so on
  4. Using the services.msc console, restart the STS Service

All of this is correctly stated in VMware’s KB article.
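
If you prefer to script the jvmRoute change rather than edit the file by hand, here is a minimal sketch in Python (3.8 or later). It is only an illustration of the edit described above, not something taken from the KB article: the function name add_jvm_route is hypothetical, and the sketch works on the file content as a string so you stay in control of reading, backing up and writing the real file.

```python
import xml.etree.ElementTree as ET


def add_jvm_route(server_xml, node_number):
    """Return the server.xml content with jvmRoute="node<N>" set on <Engine>."""
    # insert_comments=True (Python 3.8+) keeps the comments located inside the
    # root element; a parsed tree also ignores any <Engine> example that sits
    # inside a comment, which a plain text search would not.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(server_xml, parser=parser)
    engine = next(root.iter("Engine"), None)
    if engine is None:
        raise ValueError("no <Engine> element found")
    engine.set("jvmRoute", "node%d" % node_number)
    return ET.tostring(root, encoding="unicode")
```

Note that the XML declaration and anything placed before the root element are not written back by this sketch, so keep a copy of the original file, and restart the STS Service afterwards as in step 4.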

 

Prepare the configuration files for SSO Service

This is the point where VMware’s KB article lacks some information. We have to edit 6 files:

  • sts_id: contains the ID of the STS Service registered in the SSO Service of Site-B
  • sts.properties: contains the new configuration of the STS Service
  • admin_id: contains the ID of the Admin Service registered in the SSO Service of Site-B
  • admin.properties: contains the new configuration of the Admin Service
  • gc_id: contains the ID of the Group Check Service registered in the SSO Service of Site-B
  • gc.properties: contains the new configuration of the Group Check Service

The article does not specify that these files simply do not exist. In fact, we have to create them. So:

  1. Create a new directory with … Windows Explorer: C:\updateInfo
  2. Create all 6 files in it

Open cmd.exe on the SSO1 Server and launch:

"C:\Program Files\VMware\Infrastructure\CIS\vmware-sso\ssolscli.cmd" listServices https://SSO1.FQDN.DOM:7444/lookupservice/sdk

The output contains all the information required to fill sts_id, admin_id and gc_id. It can be found on the lines starting with ServiceId=. For example:

Site-B:a03772af-b7db-4629-ac88-ba677516e2b1

Just fill these 3 files with the matching lines (1 line for each service/file).
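
If the output is long, the IDs can be pulled out with a few lines of Python. This is only a sketch: it assumes that each ID sits on its own line after "ServiceId=" (matched case-insensitively, since the exact layout of the output is not reproduced here), and the function name service_ids is hypothetical. Deciding which ID belongs to the STS, Admin and Group Check services is still up to you.

```python
import re

# One ID per line, after "ServiceId=" (assumed layout of the listServices output).
SERVICE_ID = re.compile(r"^[ \t]*ServiceId=(\S+)", re.IGNORECASE | re.MULTILINE)


def service_ids(list_services_output):
    """Return every ServiceId value found in the command output, in order."""
    return SERVICE_ID.findall(list_services_output)
```

Each returned value, for example Site-B:a03772af-b7db-4629-ac88-ba677516e2b1, is what goes on the single line of the matching _id file.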

The next step is filling in the “.properties” files. In each section below, the environment-specific values (originally highlighted in bold; typically viSite, the host name in uri and the path in ssl) must be modified to reflect your configuration. So, for “sts.properties”, the content should match this pattern:

[service]
friendlyName=The security token service interface of the SSO server
version=1.5
ownerId=
type=urn:sso:sts
description=The security token service interface of the SSO server
productId=product:sso
viSite=Site-B

[endpoint0]
uri=https://vip-sso.fqdn.com:7444/sts/STSService/vsphere.local
ssl=C:\updateInfo\Chain.pem
protocol=wsTrust

For “admin.properties”, the content should match this pattern:

[service]
friendlyName=The administrative interface of the SSO server
version=1.5
ownerId=
type=urn:sso:admin
description=The administrative interface of the SSO server
productId=product:sso
viSite=Site-B

[endpoint0]
uri=https://vip-sso.fqdn.com:7444/sso-adminserver/sdk/vsphere.local
ssl=C:\updateInfo\Chain.pem
protocol=vmomi

And, for “gc.properties”, the content should match this pattern:

[service]
friendlyName=The group check interface of the SSO server
version=1.5
ownerId=
type=urn:sso:groupcheck
description=The group check interface of the SSO server
productId=product:sso
viSite=Site-B

[endpoint0]
uri=https://vip-sso.fqdn.com:7444/sso-adminserver/sdk/vsphere.local
ssl=C:\updateInfo\Chain.pem
protocol=vmomi
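
Since the three files only differ by a handful of values, they can also be generated from one template. The sketch below is in Python and reproduces the three patterns above; the function name render_properties and the way the values are split into parameters are my own choices for illustration, not something from the KB article.

```python
# Layout shared by the three files; site-specific values are parameters.
TEMPLATE = """\
[service]
friendlyName={description}
version=1.5
ownerId=
type={type}
description={description}
productId=product:sso
viSite={site}

[endpoint0]
uri=https://{vip}:7444/{path}
ssl={pem}
protocol={protocol}
"""

# Per-service values, copied from the three patterns above.
SERVICES = {
    "sts": dict(description="The security token service interface of the SSO server",
                type="urn:sso:sts", path="sts/STSService/vsphere.local",
                protocol="wsTrust"),
    "admin": dict(description="The administrative interface of the SSO server",
                  type="urn:sso:admin", path="sso-adminserver/sdk/vsphere.local",
                  protocol="vmomi"),
    "gc": dict(description="The group check interface of the SSO server",
               type="urn:sso:groupcheck", path="sso-adminserver/sdk/vsphere.local",
               protocol="vmomi"),
}


def render_properties(service, vip, site, pem):
    """Return the content of <service>.properties for the given vIP, site and .pem path."""
    return TEMPLATE.format(vip=vip, site=site, pem=pem, **SERVICES[service])


if __name__ == "__main__":
    print(render_properties("sts", "vip-sso.fqdn.com", "Site-B", r"C:\updateInfo\Chain.pem"))
```

Writing each result to C:\updateInfo under the name sts.properties, admin.properties or gc.properties gives the same files as typing them by hand.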

 

Reconfigure the services

The next point is to alter the configuration of the Site-B SSO Service. These commands must be submitted only once. Just choose one of your 2 nodes and proceed (in a cmd.exe), replacing the host names, user and password to match your setup:

cd C:\Program Files\VMware\Infrastructure\CIS\vmware-sso\
ssolscli.cmd updateService -d https://sso1.fqdn.com:7444/lookupservice/sdk -u [email protected] -p <YOURPASSWORD> -si C:\updateInfo\sts_id -ip C:\updateInfo\sts.properties

This will reconfigure the STS Service of this SSO Site, i.e. the configuration will be applied to all member nodes of this site. Please note that ssolscli.cmd is case sensitive: updateservice is not updateService. The result can be checked using the command:

ssolscli.cmd listServices https://vip-sso.fqdn.com:7444/lookupservice/sdk

You will see that the endpoint of the STS Service is now set to the vIP, while the others are still set to the SSO node name. Next, we will reconfigure the Admin Service and the Group Check Service. Please note that there is an error in VMware’s KB article. To reconfigure the Admin Service, submit:

ssolscli.cmd updateService -d https://vip-sso.fqdn.com:7444/lookupservice/sdk -u [email protected] -p <YOURPASSWORD> -si C:\updateInfo\admin_id -ip C:\updateInfo\admin.properties

At this point, running ssolscli.cmd listServices with any URL (vip-sso or sso1) will return an error.

To reconfigure the Group Check Service, submit :

ssolscli.cmd updateService -d https://vip-sso.fqdn.com:7444/lookupservice/sdk -u [email protected] -p <YOURPASSWORD> -si C:\updateInfo\gc_id -ip C:\updateInfo\gc.properties

After that, ssolscli.cmd listServices works once again. But you must not try to register any service with the SSO at this point.
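
To avoid typos in these three long command lines, they can be generated. The Python sketch below only builds the strings, in the same order and with the same targets as above (STS against the node, then Admin and Group Check against the vIP); it does not run anything. The function name update_service_commands is hypothetical, and the user and password are plain parameters that you must supply yourself.

```python
def update_service_commands(node_host, vip_host, user, password,
                            work_dir=r"C:\updateInfo"):
    """Return the three updateService command lines, in the order they must be run."""
    # STS is updated through the node itself; Admin and Group Check through the vIP.
    targets = [("sts", node_host), ("admin", vip_host), ("gc", vip_host)]
    commands = []
    for name, host in targets:
        url = "https://%s:7444/lookupservice/sdk" % host
        commands.append(
            "ssolscli.cmd updateService -d %s -u %s -p %s -si %s\\%s_id -ip %s\\%s.properties"
            % (url, user, password, work_dir, name, work_dir, name)
        )
    return commands
```

Print the result, check it, then paste the lines one at a time into the cmd.exe opened in the vmware-sso directory. A password containing characters that cmd.exe treats specially may need quoting, which this sketch does not attempt.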

Finalize

Restart both SSO Node 1 & SSO Node 2. The KB article specifies that the Edge device must be rebooted but, after all, this does not seem to be required.

 

Test it and Enjoy

This is the final step : Try it and Enjoy.

Multiple methods can be used to try it, but here are some examples:

  1. Open your web browser and go to “https://vip-sso.fqdn.com:7444/lookupservice/sdk” ==> Success (I hope)
  2. Stop the first node, and try again at the same URL ==> Success (I hope)
  3. Restart the first node, stop the second node, and try again at the same URL ==> Success (I still hope)
  4. And the final step : Register a service 🙂
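
The first three checks can also be done without a browser. Here is a minimal sketch in Python using only the standard library; the function names are hypothetical, and it makes one assumption you should be aware of: any HTTP answer from the lookup service, even an error status, is counted as "reachable", since the point is only to see whether a node answers behind the vIP.

```python
import ssl
import urllib.error
import urllib.request


def lookup_url(vip_fqdn):
    """URL of the lookup service behind the load balancer."""
    return "https://%s:7444/lookupservice/sdk" % vip_fqdn


def lookup_service_reachable(vip_fqdn, timeout=5, cafile=None):
    """Return True if something answers over HTTPS at the lookup service URL."""
    # Certificate verification stays on; pass your CA bundle via cafile if the
    # SSO certificate was signed by an internal CA.
    context = ssl.create_default_context(cafile=cafile)
    try:
        urllib.request.urlopen(lookup_url(vip_fqdn), timeout=timeout, context=context)
        return True
    except urllib.error.HTTPError:
        return True   # assumption: an HTTP error status still means a node answered
    except (urllib.error.URLError, OSError):
        return False  # no answer, refused connection, timeout or TLS failure
```

Call lookup_service_reachable("vip-sso.fqdn.com") before stopping a node, then again after each stop and restart. Because verification is left on, a False result can also point to a certificate that does not cover the vIP name.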

I do hope this article helped you. If so, you may want to share it on some social networks.

 


3 Replies to “VMware SSO 5.5 in HA mode”

  1. Thanks for your comment.
    Unfortunately, I was unable to reproduce this problem with Firefox. You may send me a screenshot (via Twitter for example) so I can see it.
    Glad to see you appreciate the design 🙂
    Cheers

  2. Hi Ludo

    Thanks for this guide!
    Works very well
    Just a modification to be more precise about the Catalina configuration

    Your words are:
    On line “<Engine defaultHost="localhost" name="Catalina">”, add at the end “jvmRoute="nodeX">”, where X is 1 for the first edited node, 2 is for the second edited node, and so on

    Remove the two ” around jvmRoute="nodeX">. Just to prevent errors

  3. Hi Francois,

    Thanks for your comment.
    As you will see, I updated the line you noticed and also 2 others that were confusing.
    Great to see it helped you 🙂

    Cheers
    Ludo
