Custom vSphere Template import into AWS as AMI

My current customer asked if they could use the same vSphere template as an AWS AMI. The current vSphere template has a custom disk layout to help them troubleshoot issues. The default single disk layout for AMI’s actually hinders their troubleshooting methodology.

Aside from the custom disk layout, I know VMtools would have to be replaced with cloud-init. Sure no problem. RIGHT! Well actually it wasn’t that hard.

Well I was finally able get it to work, and learned a bunch along the way. Those lessons include,

  • The RHEL default DHCP client is incompatible with AWS.
  • EFI bios is only supported in larger, more expensive instances.
  • AWS VM image import.
  • Make sure to enable ‘disable_vmware_customization’, if that made sense.

Requirements

  • AWS roles, policies and permissions per this document.
  • S3 bucket (packer-import-example) to store the VMDK until it is imported.
  • Basic IAM user (packer) with the correct permissions assigned (see above).
  • vSphere environment to build the image.
  • A RHEL 8.x DVD ISO for installation.
  • HTTP repo to store the kickstart file.

Now down to brass tacks. To be honest it took lots of trial and error (mostly error) to get this working right. For example, on one pass Cloud-Init wouldn’t run on the imported AMI. After looking at cloud.cfg I noticed ‘disable_vmware_customization’ was set to false instead of ‘true’. Another error occurred when my first import attempt failed as the machine did not have a ‘DHCP client’. That was odd as it booted up fine in vSphere and got an IP Address. Apparently AWS only supports certain DHCP clients. Go figure.

Eventually the machine booted properly in AWS, with the user-data applied correctly. The working user-data is in the repo’s cloud-init directory.

And my super simple vRAC blue print even worked. This simple BP adds a new user, assigns a password, and grants it SUDO permissions.

Successful vRA Cloud Deployment

A couple of notes on the packer amazon-import post processor. Those include,

  • The images are encrypted by default, even tho the default for ‘ami_encrypt’ is false by default.
  • ‘ami_name’ requires the AWS permission of ec2:CopyImage on the policy for the import role.
  • Don’t use the default encryption key if you wish to share this. You’ll need a Customer Managed Key (CMK). The import role (vmimport) will need to be a key user. You can set this with ‘ami_kms_key’ set to the Id of the CMK (i.e., ebea!!!!!!!!-aaaa-zzzz-xxxxxxxxxxxxxx)
  • The CMK needs to be shared with the target customer before sharing the AMI. ‘ami_org_arns’ allows you to set the organizations you’d like to share the AMI with.
  • There are lots of import options, you can check them out here.

This working example, plus others I’ve been working are available in this github repo.

Now onto another vRAC adventure.

Packer HCL and PVSCSI drivers

Just this last week I was updating an old Packer build configuration from JSON to HCL. But for the life of me could not get a new vSphere Windows 2019 machine to find a disk attached to Para Virtualized disk controller.

I repeatedly received this error after the machine new machine booted.

Error

In researching error 0x80042405 in C:\Windows\pather\setuperr.log, I found it simply could not find the attached disk.

setuperr.log

After some research I determined the PVSCSI drivers added to the floppy disk where not being discovered. Or more specifically the new machine didn’t know to search the floppy for additional drivers.

I finally found a configuration section for my autounattend.xml file which would fix it after an almost exhaustive online search.

The magic section reads as follows.

<unattend xmlns="urn:schemas-microsoft-com:unattend">
    <settings pass="windowsPE">        
       <component name="Microsoft-Windows-PnpCustomizationsWinPE" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS" xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <DriverPaths>
                <PathAndCredentials wcm:action="add" wcm:keyValue="A">
                    <!-- pvscsi-Windows8.flp -->
                    <Path>A:\</Path>
                </PathAndCredentials>
            </DriverPaths>
        </component>
...
    </settings>

After adding this section, the new vSphere Windows machine easily found the additional drivers.

This was tested against Windows 2019 in both AWS and vSphere deployments.

The vSphere deployment took an hour, mostly waiting for the updates to be applied. AWS takes significantly less time as I’m using the most recently updated image they provide.

The working files are located in the packer-hcl-vsphere-aws github repo.

Code Stream Nested Esxi pipeline Part 2

In this second part, I’ll discuss the actual Code Stream pipeline.

As stated before, the inspiration was William Lams wonderful Power Shell scripts to deploy a nested environment from a CLI. His original logic was retained as much as possible, however due to the nature of K8S a few things had to be changed. I’ll try to address those as they come up.

After some thought I decided to NOT allow the requester to select the amount of Memory, vCPU, or VSAN size. Each Esxi host has 24G of Ram, 4 vCPU, and contributes a touch over 100G to the VSAN. The resulting cluster has 72G of RAM, 12 vCPUs and a roughly 300G VSAN. Only Standard vSwitches are configured in each host.

The code, pipeline and other information is available on this github repo.

Deployment of the Esxi hosts is initiated by ‘deployNestedEsxi.ps1’. There are few changes from the original script.

  1. The OVA configuration is only grabbed once. Then only the specific host settings (IP Address and Name are changed.
  2. The hosts are moved into a vApp once built.
  3. The NetworkAdapter settings are performed after deployment.
  4. Persisted the log to /var/workspace_cache/logs/vsphere-deployment-$BUILDTIME.log.

Deployment of the vCSA is handled by ‘deployVcsa.ps1’ Some notable changes from the original code include.

  1. Hardcoded the SSO username to administrator@vsphere.local.
  2. Hardcoded the size to ‘tiny’.
  3. Save the log file to /var/workspace_cache/logs/NestedVcsa-$BUILDTIME.log.
  4. Save the configuration template to /var/workspace_cache/vcsajson/NestedVcsa-$BUILDTIME.json.
  5. Move the VCSA into the vApp after deployment is complete.

And finally ‘configureVc.ps1’ sets up the Cluster and VSAN. Some changed include.

  1. Hardcoded the Datacenter name (DC), and Cluster (CL1).
  2. Import the Esxi hosts by IP (No DNS records setup for the hosts or vCenter).
  3. Append the configuration results to /var/workspace_cache/logs/vsphere-deployment-$BUILDTIME.log.

So there you go, down and simple Code Stream pipeline to deploy a nested vSphere environment in about an hour.

Stay tuned. The next article will include an NSX-T deployment.

Code Stream Nested Esxi pipeline Part 1

Been a while since my last post. Over the last couple of months I’ve been tinkering with using Code Stream to deploy a Nested Esxi / vCenter environment.

My starting point is William Lams excellent PowerShell script (vsphere-with-tanzu-nsxt-automated-lab-deployment). I also wanted to use the official vmware/poweclicore docker image.

Well let’s just say it’s been an adventure. Much has been learned through trial and (mostly) error.

For example in Williams script, all of the files are located on the workstation where the script runs. Creating a custom docker image with those files would have resulted in a HUGE file, almost 16GB (Nested ESXi appliance, vCSA appliance and supporting files, and NSX-T OVA files). As one of my co-worker says, “Don’t be that guy”.

At first I tried cloning the files into the container as part of the CI setup. Downloading the ESXi OVA worked fine, but failed when I tried copying over the vCSA files. I think it’s just too much.

I finally opted to use a Kubernetes Code Stream instead of a Docker pipeline. This allowed me to use a Persistent Volume Claim.

Kubernetes setup

Some of the steps may lack details, as this has been an ongoing effort and just can’t remember everything. Sorry peeps!

Create two Name Spaces, codestream-proxy and codestream-workspace. Codestream-proxy is used by Code Stream to host a Proxy pod.

Codestream-workspace will host the containers running the pipeline code.

Next came the service account for Code Stream. The path of least resistance was to simply assign ‘cluster-admin’ to the new service account. NOTE: Don’t do this in a production environment.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cs-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: codestream
  apiGroup: ""
  namespace: default

Next came the Persistent Volume (pv) and Persistent Volume Claim (pvc). My original pv was set to 20GI, which after some testing was determined too small. It was subsequently increased it to 30GI. The larger pv allowed me to retain logs and configurations between runs (for troubleshooting).

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
  name: cs-persistent-volume-cw
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 30Gi
  hostPath:
    path: /mnt/nested
  persistentVolumeReclaimPolicy: Retain
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cs-pvc-cw
  namespace: codestream-workspace
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 30Gi
  volumeName: cs-persistent-volume-cw

The final step in k8s is to get the Service Account token. In this example the SA is called ‘codestream’ (So creative).

k get secret codestream-token-blah!!! -o jsonpath={.data.token} | base64 -d | tr -d "\n"

eyJhbGciOiJSUzI1NiIsImtpZCI6IncxM0hIYTZndS1xcEdFVWR2X1Z4UFNLREdQcGdUWDJOWUF1NDE5YkZzb.........

Copy the token, then head off to Code Stream.

Codestream setup

There I added a Variable to hold the token, called DAG-K8S-Secret.

Then went over to Endpoints, where I added a new Kubernetes endpoint.

Repo setup

The original plan was to download the OVA/OVF files from a repo every time the pipeline ran. However an error would occur on every VCSA file set download. Adding more memory to the container didn’t fix the problem, so I had to go in another direction.

The repo is well connected to the k8s cluster, so the transfer is pretty quick. Here is the directory structure for the repo (http://repo.corp.local/repo/).

NOTE: You will need a valid account to download VCSA and NSX-T.

NOTE: NSX-T will be added to the pipeline later.

Simply copying the files interactively on the k8s node seemed like the next logical step. Yes the files copied over nicely, but any attempt to deploy the VCSA appliance would throw a python error complaining about a missing ‘vmware’ module.

However I was able to run the container manually, copy the files over and run the scripts successfully. Maybe a file permissions issue?

Finally I ran the pipeline with a long sleep at the beginning. Using an interactive session, and copied the files over. This fixed the problem.

Here are the commands I used to copy the files over interactively.

k -n codestream-workspace exec -it po/running-cs-pod-id bash
wget -mxnp -q -nH http://repo.corp.local/repo/ -P /var/workspace_cache/ -R "index.html*"
# /var/workspace_cache is the mount point for the persistent volume
# need to chmod +x a few files to get the vCSA to deploy
chmod +x /var/workspace_cache/repo/vcsa/VMware-VCSA-all-7.0.3/vcsa/ovftool/lin64/ovftool*
chmod +x /var/workspace_cache/repo/vcsa/VMware-VCSA-all-7.0.3/vcsa/vcsa-cli-installer/lin64/vcsa-deploy*

This should do it for now. The next article will cover some of the pipeline details, and some of the changes I had to make to William Lams Powershell code.

Happy holidays.

Cloud Extensibility Appliance vRO Properties using PowerShell

In this article I’ll show you how to return JSON as a vRO Property type using vRA Cloud Extensibility Proxy (CEXP) vRO PowerShell 7 scriptable tasks.

First a couple of notes about the CEXP.

  • It is BIG, 32GB of RAM. However my lab instance is using less than 7 GB active memory.
  • 8 vCPU, and runs about 50% on average.
  • It deploys with 4 disks, using a tad less than 210 GB.

Why PowerShell 7? Well it was a design decision based on the customers PS proficiency.

Now down to the good stuff. Here are the details of this basic workflow using PowerShell 7 as Scriptable Tasks.

  • Get a new vRA Cloud Bearer Token
    • Save it, along with other common header values to an output variable named ‘headers’ (Properties)
  • The second scriptable task will use the header and apiEndpoint to GET the vRAC version information (About).
    • Then save version information to an output variable named ‘vRacAbout’ (Properties)

Getting (actually POST) the bearerToken is fairly simple. Here is the code for the first task.

function Handler($context, $inputs) {
    <#
    .PARAMETER $inputs.refreshToken (SecureString)
        vRAC Refresh Token

    .PARAMETER $inputs.apiEndpoint (String)
        vRAC Base API URL

    .OUTPUT headers (Properties)
        Headers including the bearerToken

    #>
    $body = @{ refreshToken = $inputs.refreshToken } | ConvertTo-Json

    $headers = @{'Accept' = 'application/json'
                'Content-Type' = 'application/json'}
    
    $Uri = $inputs.apiEndpoint + "/iaas/api/login"
    $requestResponse = Invoke-RestMethod -Uri $Uri -Method Post -Body $body -Headers $headers 

    $bearerToken = "Bearer " + $requestResponse.token 
    $authorization = @{ Authorization = $bearerToken}
    $headers += $authorization

    $output=@{headers = $headers}

    return $output
}

The second task consumes the headers produced by the first task, then GET(s) the Version Information from the vRA Cloud About route (‘/iaas/api/about’). The results are then returned as the vRacAbout (Properties) variable.

function Handler($context, $inputs) {
    <#
    .PARAMETER $inputs.headers (Properties)
        vRAC Refresh Token

    .PARAMETER $inputs.apiEndpoint (String)
        vRAC Base API URL

    .OUTPUT vRacAbout (Properties)
        vRAC version information from the About route

    #>
    $requestUri += $inputs.apiEndpoint + "/iaas/api/about"
    $requestResponse = Invoke-RestMethod -Uri $requestUri -Method Get -Headers $headers

    $output=@{vRacAbout = $requestResponse}

    return $output
}

Here, you can see the output variables for both tasks are populated. Pretty cool.

As you can see, using the vRO Properties type is fairly simple using the PowerShell on CEXP vRO.

The working workflow package is available here.

Happy coding.