Packer HCL and PVSCSI drivers

Just this last week I was updating an old Packer build configuration from JSON to HCL, but for the life of me I could not get a new vSphere Windows Server 2019 machine to find a disk attached to the paravirtual SCSI (PVSCSI) controller.

I repeatedly received this error after the new machine booted.

Error

In researching error 0x80042405 in C:\Windows\Panther\setuperr.log, I found the installer simply could not find the attached disk.

setuperr.log

After some research I determined the PVSCSI drivers added to the floppy disk were not being discovered. More specifically, the new machine didn't know to search the floppy for additional drivers.

After an almost exhaustive online search, I finally found a configuration section for my autounattend.xml file that fixed it.

The magic section reads as follows.

<unattend xmlns="urn:schemas-microsoft-com:unattend">
    <settings pass="windowsPE">        
       <component name="Microsoft-Windows-PnpCustomizationsWinPE" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <DriverPaths>
                <PathAndCredentials wcm:action="add" wcm:keyValue="A">
                    <!-- pvscsi-Windows8.flp -->
                    <Path>A:\</Path>
                </PathAndCredentials>
            </DriverPaths>
        </component>
...
    </settings>
</unattend>

After adding this section, the new vSphere Windows machine easily found the additional drivers.

This was tested against Windows 2019 in both AWS and vSphere deployments.

The vSphere deployment took an hour, mostly waiting for the updates to be applied. AWS takes significantly less time as I’m using the most recently updated image they provide.

The working files are located in the packer-hcl-vsphere-aws GitHub repo.

Code Stream Nested Esxi pipeline Part 2

In this second part, I’ll discuss the actual Code Stream pipeline.

As stated before, the inspiration was William Lam's wonderful PowerShell scripts to deploy a nested environment from a CLI. His original logic was retained as much as possible; however, due to the nature of K8s, a few things had to be changed. I'll try to address those as they come up.

After some thought I decided NOT to allow the requester to select the amount of memory, vCPUs, or VSAN size. Each Esxi host has 24GB of RAM, 4 vCPUs, and contributes a touch over 100GB to the VSAN. The resulting cluster has 72GB of RAM, 12 vCPUs, and roughly a 300GB VSAN. Only standard vSwitches are configured in each host.

The code, pipeline, and other information are available in this GitHub repo.

Deployment of the Esxi hosts is initiated by ‘deployNestedEsxi.ps1’. There are a few changes from the original script:

  1. The OVA configuration is only grabbed once, then only the specific host settings (IP address and name) are changed.
  2. The hosts are moved into a vApp once built.
  3. The NetworkAdapter settings are performed after deployment.
  4. The log is persisted to /var/workspace_cache/logs/vsphere-deployment-$BUILDTIME.log.

Deployment of the vCSA is handled by ‘deployVcsa.ps1’. Some notable changes from the original code include:

  1. Hardcoded the SSO username to administrator@vsphere.local.
  2. Hardcoded the size to ‘tiny’.
  3. Saved the log file to /var/workspace_cache/logs/NestedVcsa-$BUILDTIME.log.
  4. Saved the configuration template to /var/workspace_cache/vcsajson/NestedVcsa-$BUILDTIME.json.
  5. Moved the VCSA into the vApp after deployment is complete.

And finally, ‘configureVc.ps1’ sets up the cluster and VSAN. Some changes include:

  1. Hardcoded the Datacenter name (DC) and Cluster name (CL1).
  2. Imported the Esxi hosts by IP (no DNS records were set up for the hosts or vCenter).
  3. Appended the configuration results to /var/workspace_cache/logs/vsphere-deployment-$BUILDTIME.log.

So there you go, a quick and simple Code Stream pipeline to deploy a nested vSphere environment in about an hour.

Stay tuned. The next article will include an NSX-T deployment.

Code Stream Nested Esxi pipeline Part 1

It's been a while since my last post. Over the last couple of months I've been tinkering with using Code Stream to deploy a nested Esxi / vCenter environment.

My starting point is William Lam's excellent PowerShell script (vsphere-with-tanzu-nsxt-automated-lab-deployment). I also wanted to use the official vmware/powerclicore Docker image.

Well let’s just say it’s been an adventure. Much has been learned through trial and (mostly) error.

For example, in William's script all of the files are located on the workstation where the script runs. Creating a custom Docker image with those files would have resulted in a HUGE image, almost 16GB (nested ESXi appliance, vCSA appliance and supporting files, and NSX-T OVA files). As one of my co-workers says, “Don’t be that guy”.

At first I tried cloning the files into the container as part of the CI setup. Downloading the ESXi OVA worked fine, but it failed when I tried copying over the vCSA files. I think they're just too large.

I finally opted to use a Kubernetes Code Stream workspace instead of a Docker pipeline. This allowed me to use a Persistent Volume Claim.

Kubernetes setup

Some of the steps may lack details, as this has been an ongoing effort and I just can't remember everything. Sorry peeps!

Create two namespaces, codestream-proxy and codestream-workspace. The codestream-proxy namespace is used by Code Stream to host a proxy pod.

The codestream-workspace namespace will host the containers running the pipeline code.

Next came the service account for Code Stream. The path of least resistance was to simply assign ‘cluster-admin’ to the new service account. NOTE: Don’t do this in a production environment.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cs-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: codestream
  apiGroup: ""
  namespace: default

Next came the Persistent Volume (pv) and Persistent Volume Claim (pvc). My original pv was set to 20Gi, which after some testing was determined to be too small. It was subsequently increased to 30Gi. The larger pv allowed me to retain logs and configurations between runs (for troubleshooting).

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
  name: cs-persistent-volume-cw
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 30Gi
  hostPath:
    path: /mnt/nested
  persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cs-pvc-cw
  namespace: codestream-workspace
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 30Gi
  volumeName: cs-persistent-volume-cw

The final step in k8s is to get the Service Account token. In this example the SA is called ‘codestream’ (So creative).

k get secret codestream-token-blah!!! -o jsonpath={.data.token} | base64 -d | tr -d "\n"

eyJhbGciOiJSUzI1NiIsImtpZCI6IncxM0hIYTZndS1xcEdFVWR2X1Z4UFNLREdQcGdUWDJOWUF1NDE5YkZzb.........

Copy the token, then head off to Code Stream.

Codestream setup

There I added a Variable to hold the token, called DAG-K8S-Secret.

Then I went over to Endpoints, where I added a new Kubernetes endpoint.

Repo setup

The original plan was to download the OVA/OVF files from a repo every time the pipeline ran. However, an error would occur on every VCSA file set download. Adding more memory to the container didn't fix the problem, so I had to go in another direction.

The repo is well connected to the k8s cluster, so the transfer is pretty quick. Here is the directory structure for the repo (http://repo.corp.local/repo/).

NOTE: You will need a valid account to download VCSA and NSX-T.

NOTE: NSX-T will be added to the pipeline later.

Simply copying the files interactively on the k8s node seemed like the next logical step. Yes, the files copied over nicely, but any attempt to deploy the VCSA appliance would throw a Python error complaining about a missing ‘vmware’ module.

However, I was able to run the container manually, copy the files over, and run the scripts successfully. Maybe it was a file permissions issue?

Finally, I ran the pipeline with a long sleep at the beginning, used an interactive session, and copied the files over. This fixed the problem.

Here are the commands I used to copy the files over interactively.

k -n codestream-workspace exec -it po/running-cs-pod-id bash
wget -mxnp -q -nH http://repo.corp.local/repo/ -P /var/workspace_cache/ -R "index.html*"
# /var/workspace_cache is the mount point for the persistent volume
# need to chmod +x a few files to get the vCSA to deploy
chmod +x /var/workspace_cache/repo/vcsa/VMware-VCSA-all-7.0.3/vcsa/ovftool/lin64/ovftool*
chmod +x /var/workspace_cache/repo/vcsa/VMware-VCSA-all-7.0.3/vcsa/vcsa-cli-installer/lin64/vcsa-deploy*

This should do it for now. The next article will cover some of the pipeline details, and some of the changes I had to make to William Lam's PowerShell code.

Happy holidays.

VMware Code Stream saved Custom Integration version issues

First the bad news. The VMware Code Stream Cloud version has a limit of 300 saved Custom Integrations and versions. Your once-working pipelines will all of a sudden get a validation error of “The saved Custom Integration version is no longer available” if you exceed this limit.

Now the not so good news. They haven’t fixed it yet!

And now the details.

vExpert 2021 Applications are open

The 2021 vExpert applications are now open!

The program “is about giving back to the community beyond your day job”.

One way I give back is by posting new and unique content here once or twice a month. Sometimes a post is simply me clearing a thought before the weekend, completing a commitment to a BU, or documenting something before moving on to another task. It doesn’t take long, but could open the door for one of my peers.

My most frequently used benefit is the vExpert and Cloud Management Slack channels. I normally learn something new every week. And it sure does feel good to help a peer struggling with something I’ve already tinkered with.

Here’s a list of some of the benefits for receiving the award.

  • Networking with 2,000+ vExperts / Information Sharing
  • Knowledge Expansion on VMware & Partner Technology
  • Opportunity to apply for vExpert BU Lead Subprograms
  • Possible Job Opportunities
  • Direct Access to VMware Business Units via Subprograms
  • Blog Traffic Boost through Advocacy, @vExpert, @VMware, VMware Launch & Announcement Campaigns
  • 1 Year VMware Licenses for Home Labs for almost all Products & Some Partner Products
  • Private VMware & VMware Partner Sessions
  • Gifts from VMware and VMware Partners
  • vExpert Celebration Parties at both VMworld US and VMworld Europe with VMware CEO, Pat Gelsinger
  • VMware Advocacy Platform Invite (share your content to thousands of vExperts & VMware employees who amplify your content via their social channels)
  • Private Slack Channels for vExpert and the BU Lead Subprograms

The applications close on January 9th, 2021. Start working on those applications now.

VMware Cloud Assembly Custom Integration lessons learned

I’ve recently spent some time refactoring Sam McGeown’s Image as Code (IaC) Codestream pipeline for VMware vRealize Automation Cloud. The following details some of the lessons learned.

First off, I found the built-in Code Stream REST tasks do not have a retry. I learned this the hard way when VMware had issues with their cloud offering last month. At times a task would get a 500 error back when making a request, resulting in a failed execution.

This forced me to look at Python custom integrations that would retry until the correct success code was returned. I was able to get the pipeline working, but it had a lot of repetitive code, lacked the ability to limit the number of retries, and was based on Python 2.

Seeing the error of my ways, I decided to refactor the code again with a custom module (for the repetitive code) and migrate to Python 3.

The original Docker image was CentOS based and did not have Python 3 installed. Instead of just installing Python 3 and increasing the size of that image, I opted to start with the Docker Official Python 3 image. I'll get to the build file later.

Now on to the actual refactoring. Here I wanted to combine the reused code into a custom Python module. My REST calls include POST (to get a Bearer Token), GET (with and without a filter), PATCH (to update Image Mappings), and DELETE (to delete the test Image Profile and Cloud Template).

This module snippet includes PostBearerToken_session, which returns the bearerToken and other headers, and GetFilteredVrac_sessions, which returns a filtered GET request. It also limits the retries to 5 attempts.

import requests
import logging
import os 
import json 
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

# retry strategy to tolerate API errors.  Will retry if the status_forcelist matches.
retry_strategy = Retry (
    total=5,
    status_forcelist=[429, 500, 502, 503, 504],
    method_whitelist=["GET", "POST"],
    backoff_factor=2,
    raise_on_status=True,
)


adapter = HTTPAdapter(max_retries=retry_strategy)
https = requests.Session()
https.mount("https://", adapter)

vRacUrl = "https://api.mgmt.cloud.vmware.com"

def PostBearerToken_session(refreshToken):
    # Post to get the bearerToken from refreshToken
    # Will return the headers with Authorization populated
    # Build post payload 
    pl = {}
    pl['refreshToken'] = refreshToken.replace('\n', '')
    logging.info('payload is ' + json.dumps(pl))
    loginURL = vRacUrl + "/iaas/api/login"
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json"
        }
    r = https.post(loginURL, json=pl, headers=headers)
    responseJson = r.json()
    token = "Bearer " + responseJson["token"]
    headers['Authorization']=token
    return headers 

def GetFilteredVrac_sessions(requestUrl, headers, requestFilter):
    # Get a thing using a filter
    requestUrl = vRacUrl + requestUrl + requestFilter
    print(requestUrl)
    adapter = HTTPAdapter(max_retries=retry_strategy)
    https = requests.Session()
    https.mount("https://", adapter)
    r = https.get(requestUrl, headers=headers)
    responseJson = r.json()
    return responseJson

Here is the working Code Stream Custom Integration used to test this module. It will get the headers, then send a filtered request using the Cloud Account Name. It then pulls some information out of the payload and logs it.


runtime: "python3"
code: |
  import json
  import requests
  import os
  import sys
  import logging

  # append /build to the path
  # this is where the custom python module is copied to
  sys.path.append('/build')
  import vRAC

  # context.py is automatically added.
  from context import getInput, setOutput

  def main():
      refreshToken = getInput('RefreshToken')
      # now with the new module
      # authHeaders will have all of the required headers, including the bearerToken
      authHeaders = vRAC.PostBearerToken_session(refreshToken)
      # test the GET function with a filter
      requestUrl = "/iaas/api/cloud-accounts"
      requestFilter = "?$filter=name eq '" + getInput('cloudAccountName') + "'"
      # get the cloudAccount by name
      cloudAccountJson = vRAC.GetFilteredVrac_sessions(requestUrl, authHeaders, requestFilter)
      # get some specific data out for later
      cloudAccountId = cloudAccountJson['content'][0]['id']
      logging.info('cloudAccountId: ' + cloudAccountId)

  if __name__ == '__main__':
      main()
inputProperties:      # Enter fields for input section of a task
  # Password input
  - name: RefreshToken
    type: text
    title: vRAC Refresh Token
    placeHolder: 'secret/password field'
    defaultValue: changeMe
    required: true
    bindable: true
    labelMessage: vRAC RefreshToken

  - name: cloudAccountName
    type: text
    title: Cloud Account Name
    placeHolder: vc.corp.local
    defaultValue: vc.corp.local
    required: true
    bindable: true
    labelMessage: Cloud Account Name

Next, the new Docker image. This uses the official Python 3 image as a starting point. The build file copies everything over (including the custom module and requirements.txt), then installs ‘requests’.

FROM python:3

WORKDIR /build

COPY . ./
RUN pip install --no-cache-dir -r requirements.txt

Now that the framework is ready, it's time to create the pipeline and test it. This is well documented in Creating and using pipelines in VMware Code Stream. Update the Host field with your predefined Docker host, set the Builder image URL (Docker Hub repo and tag), and set the Working directory to ‘/build’ (to match WORKDIR in the Dockerfile).

Running the pipeline worked and returned the requested information.

This was a fairly brief article. I really just wanted to get everything written down before the weekend. I’ll have more in the near future.

VMware vRealize Automation Cloud Image Profiles lessons learned

Over the last month I've spent most of my time trying to replicate Sam McGeown's Build, test and release VM images with vRealize Automation Code Stream and Packer in vRealize Automation Cloud. One of the tasks is to create or update an Image Profile. The following article details some of the things I've learned.

First off, the Image Mappings shown in the UI cannot be updated via the API directly. The API allows you to create, change, and delete Image Profiles. An Image Profile contains the Image Mappings and is tied to a Region.

For example, in this image the IaC Image Mappings are displayed.

Here are some of the same Image Mappings as seen when you GET the Image Profile by id.

{
  "imageMapping": {
    "mapping": {
      "IaC-build-test-patch": {
        "id": "8fc331632163f53fd0c66e0407495504295b4c1c",
        "name": "",
        "description": "Template: vSphere-CentOS8-CUSTOM-2020.09.25.181019"
      },
      "IaC-prod-profile": {
        "id": "2e2d31be93c59531d2c1eeeadc58f68b66174559",
        "name": "",
        "description": "Template: vSphere-CentOS8-CUSTOM-2020.09.25.144314"
      },
      "IaC-test-profile": {
        "id": "842c91f05185978d62d201df3b47d1505cf3fea3",
        "name": "",
        "description": "Generic CentOS 7 template with cloud-init installed and VM hardware version 13 (compatible with ESXi 6.5 or greater)."
      }
    }
  },
  "regionId": "71cecc477594a67558b9d5xxxxxxx",
  "name": "IaC-build-test-profile",
  "description": "Packer build image for testing"
}

But how do you get the Image Profile Id? I ended up using a filter based on the externalRegionId (I used another filtered search to find the externalRegionId by Region Name).

https://api.mgmt.cloud.vmware.com/iaas/api/image-profiles?$filter=externalRegionId eq 'Datacenter:datacenter-xyz'

This returned the following payload (cleaned up a bit). I reference this later as cloudAccountJson.

{
  "content": [
    {
      "imageMappings": {
        "mapping": {
          "IaC-prod-profile": {
            ...
          },
          "IaC-build-test-patch": {
            ...
          },
          "IaC-test-profile": {
            ...
          }
        },
        "externalRegionId": "",
        ...
      },
      "externalRegionId": "Datacenter:datacenter-xyz",
      ...
      "name": "IaC-build-test-profile",
      "description": "Packer build image for testing",
      "id": "fa57fef8-5d0e-494b-b299-7e4a9030ac11-71cecc477594a67558b9d5f056260",
      ...
    }
  ],
  "totalElements": 1,
  "numberOfElements": 1
}

The id will be used later to update (PATCH) the Image Profile.
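For reference, pulling that id out of the filtered response can look roughly like the following. This is a minimal sketch rather than code from the pipeline: the get_image_profile_id helper and its parameters are illustrative, and it assumes you already have the Authorization headers (a bearer token from /iaas/api/login).

import requests

vRacUrl = "https://api.mgmt.cloud.vmware.com"

def get_image_profile_id(headers, externalRegionId):
    # Filtered GET for the Image Profile tied to a region,
    # e.g. externalRegionId = 'Datacenter:datacenter-xyz'
    requestUrl = vRacUrl + "/iaas/api/image-profiles"
    requestFilter = "?$filter=externalRegionId eq '" + externalRegionId + "'"
    r = requests.get(requestUrl + requestFilter, headers=headers)
    r.raise_for_status()
    cloudAccountJson = r.json()
    # the profile id sits in the first (and only) element of 'content'
    return cloudAccountJson['content'][0]['id']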

Now to build the PATCH body to update the Profile. The API has the following body example.

{ "name": "string", "description": "string", "imageMapping": "{ \"ubuntu\": { \"id\": \"9e49\", \"name\": \"ami-ubuntu-16.04-1.9.1-00-1516139717\"}, \"coreos\": { \"id\": \"9e50\", \"name\": \"ami-coreos-26.04-1.9.1-00-543254235\"}}", "regionId": "9e49" }

Then, using the cloudAccountJson returned previously, I built a new body using the following (partial) code (I couldn't get the formatting right, hence the image).
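In rough terms, the body is built from the profile returned above. Here is a minimal sketch of that idea; the patch_image_profile helper, its parameter names, and the way the new image id is passed in are illustrative, not the exact code from the pipeline. It rebuilds the full mapping (so no mappings are lost), JSON-encodes it because the API expects imageMapping as a string, and PATCHes the profile by id.

import json
import requests

vRacUrl = "https://api.mgmt.cloud.vmware.com"

def patch_image_profile(headers, profile, regionId, mappingName, newImageId):
    # 'profile' is one element of cloudAccountJson['content'] from the filtered GET
    # rebuild every existing mapping so nothing is dropped by the PATCH
    mapping = {}
    for name, image in profile['imageMappings']['mapping'].items():
        mapping[name] = {"id": image['id'], "name": image.get('name', '')}
    # point the chosen mapping at the newly built image
    mapping[mappingName] = {"id": newImageId, "name": ""}

    body = {
        "name": profile['name'],
        "description": profile['description'],
        "imageMapping": json.dumps(mapping),   # the API wants a JSON-encoded string here
        "regionId": regionId
    }
    r = requests.patch(vRacUrl + "/iaas/api/image-profiles/" + profile['id'],
                       json=body, headers=headers)
    r.raise_for_status()
    return r.json()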

Now some gotchas.

First, remember that the Image Mappings are tied to the region. You will lose any Image Mappings NOT included in the POST/PATCH body. Make sure you back up the Image Profile settings (do a GET by Image Profile id) before attempting to change the mappings via the API.
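A quick way to take that backup is to GET the profile by id and write the response to disk. A minimal sketch, assuming the same headers as before and an arbitrary backup path:

import json
import requests

vRacUrl = "https://api.mgmt.cloud.vmware.com"

def backup_image_profile(headers, profileId, backupPath):
    # GET the Image Profile by id and save it before changing any mappings
    r = requests.get(vRacUrl + "/iaas/api/image-profiles/" + profileId, headers=headers)
    r.raise_for_status()
    with open(backupPath, 'w') as f:
        json.dump(r.json(), f, indent=2)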

Secondly, an Image Profile does not have a name by default. You need to set this via the API. Why would you need it? Well, you may want to find the Image Profile by name later. My current customer creates new customer Image Profiles via the API and uses a ‘tag’-like naming convention.

Thirdly, I've experienced several 500 errors when interfacing with the vRA Cloud API. The out-of-box Code Stream REST tasks do not retry. I ended up writing Python custom integrations as a workaround. These retry until receiving the correct response code (I've seen up to fifteen 500 errors before getting a 200).
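The retry logic in those custom integrations boils down to something like this (a minimal sketch; the function name, attempt cap, and delay are illustrative, not the values used in the actual pipeline):

import time
import requests

def get_with_retry(url, headers, expected=200, maxAttempts=20, delay=5):
    # keep calling the vRA Cloud API until the expected status code comes back
    for attempt in range(1, maxAttempts + 1):
        r = requests.get(url, headers=headers)
        if r.status_code == expected:
            return r.json()
        time.sleep(delay)
    raise RuntimeError("gave up after " + str(maxAttempts) + " attempts; last status " + str(r.status_code))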

This is just one of the things I've learned about the vRA Cloud API and Code Stream. I'll post more as I have time.