VMware PKS Bosh CLI client SSL trust

This past week or so has been spent deploying VMware PKS Enterprise in my lab.  My main installation guide was provided by Pivotal’s Installing Enterprise PKS on vSphere with NSX-T.  

All was going well until I tried to deploy a cluster.  I could see the machines being deployed in vCenter and various NSX-T components being deployed.  However, the cluster deployment failed with the following error.

Name:                     cluster-03
Plan Name:                small
UUID:                     ad4ba957-35bc-4500-ace8-1cfda0238a83
Last Action:              CREATE
Last Action State:        failed
Last Action Description:  Instance provisioning failed: 
..... task-id: 289, ... result: 2 of 7 pre-start scripts failed. Failed Jobs: ...

As you can see task 289 failed.  Now how the heck do I get the details of the failed task?

The bosh cli client appeared to be the answer. Reading further it looked like I needed to set some environment variables to make it work properly.

After reading a few online documents, I was able to find the Bosh Command Line Credentials (Actually the bosh environment variables) by clicking on the Bosh Tile in Operations Manager, clicking on the Credentials tab, then clicking the link next to Bosh Commandline Credentials.

boshCreds

The provided BOSH_CA_CERT path and file do not exist on my jump machine.  I was able to download the root CA following these steps. (Installing uaac is beyond the scope of this document).

uaac target https://opsman.corp.local/uaa --skip-ssl-validation

uaac token owner getClient ID: opsman
Client secret:
User name: admin
Password: *******

uaac contexts

Copy the admin bearer token from the client_id section (the token is actually called access_token).

[0]*[https://pks.pks.corp.local:8443]

  skip_ssl_validation: true

  ca_cert: root_ca_certificate

  [0]*[admin]

      client_id: admin

      access_token: eyJhbGci .....

      token_type: bearer

Finally downloading the certificate to my jump machine.

curl https://opsman.corp.local/api/v0/security/root_ca_certificate -X GET -H "Authorization: Bearer eyJhbGci ....." -k > root_ca_certificate

My reformatted bosh environment settings, along with the correct path to my certificate ended up like this.

export BOSH_CLIENT=ops_manager 
export BOSH_CLIENT_SECRET=MP0................Blah! 
export BOSH_CA_CERT=/root/root_ca_certificate <--- Correct path and file
export BOSH_ENVIRONMENT=bosh.corp.local

After pasting the variables into my console, I attempted to get the details from the failed task.

bosh task 289 
Validating Director connection config:
  Parsing certificate 1: Missing PEM block
Exit code 1

What the heck?  Apparently the downloaded certificate is actually in JSON format, AND it includes ‘\n’ as line returns.

{"root_ca_certificate_pem":"-----BEGIN CERTIFICATE-----\nMIIDUDCC...
...
....
cQswzKxnm8ZfedoVheV9OBnYQyrHV2ePG/W+kfCoqXD\n ....
CeEzZD6ZicGuv7KcYNP\n...\n-----END CERTIFICATE-----\n"}

Using Notepad ++ I replaced all of the ‘\n’ with a line return.

nppFandR

Then I removed the quotes, brackets , root_ca_certificate.pem section, and deleted all of the other newlines leaving me with a clean certificate (Each line needs to be 64 characters long).

formattedCert

After saving this on my machine, I attempted to run the command again, this time using the –ca-cert option pointing to the new certificate.

bosh task 289 --ca-cert root_ca_2.pem 
Using environment 'bosh.corp.local' as client 'ops_manager'

Task 289
....

Task 289 | 16:58:24 | Updating instance master: master/51431548-e35a-471b-853f-26dc7eca9f7c (0) (canary) (00:02:06)

                    L Error: Action Failed get_task: Task ...
Task 289 | 17:00:30 | Error: Action Failed get_task: ...
Task 289 Started  Wed May  8 16:55:34 UTC 2019
Task 289 Finished Wed May  8 17:00:30 UTC 2019
Task 289 Duration 00:04:56
Task 289 error

Capturing task '289' output:
  Expected task '289' to succeed but state is 'error'
Exit code 1

Success!

Now all all I need to do is figure out the error.  Oh joy!

 

Nested NSX-T cluster on vSphere 6.7U1

I took some time this week to update William Lams Nested vSphere 6.5 with NSX-T to Nested vSphere 6.7U1/NSX-T 2.3.1, to kickstart a new customer project.

This version updated some of the vCSA OVF JSON fields, and added support for PowerShell 6.1. It was tested against PowerCLI 11.2 on a Windows 10 machine.

The original post can be found at virtuallyghetto.

You can find the new, updated file(s) at nested-nsxt231-vsphere67u1.