One of my current tasks is to leverage vRealize Automation Orchestrator to meet the following use case:
Get the next available subnet from InfoBlox
Reserve the gateway and other IPs in the new subnet
Create a new NSX-T segment
Create new NSX-T security groups
Discover the new segment in vRealize Automation Cloud
Assign the new InfoBlox subnet to the discovered Fabric Network
Create a new Network Profile in vRA Cloud
This week's goal was to get the InfoBlox part working. Well, I had it working two years ago, but couldn't remember how I did it (CRS).
Today I'll discuss how to use vRO to get the next available subnet from InfoBlox. The solution uses a PowerShell module (PSM) I built, along with a PowerShell script that does the actual heavy lifting. One key difference between my solution and the one in VMware's documentation is the naming of the zip file, which affects how you import and use it in vRO.
The code used in this example is available in this GitHub repo. Clone the repo, then run the following command to zip up the files.
zip -r -X nextibsubnet.zip . -x ".git/*" -x "README.md"
Next, import the zip file into vRO, add some inputs, modify the output, and finally run it.
Within vRO, add a new Action, then change the script type to "PowerCLI 12 (PowerShell 7.1)".
PowerShell Script Type
Change the Type to 'Zip' using the dropdown under 'Type'.
Click ‘Import’, then browse to the folder containing the zip file from earlier in the article.
You will notice the name is not 'nextibsubnet.zip' but 'InfobloxGetNextAvailableSubnet.zip'; the imported zip assumes the name of the vRO Action.
Now for the biggest difference between my approach and the VMware way. If you look at the cloned folder you will see a file named 'getNextAvailableIbSubnet.ps1'; the VMware document calls this file 'handler.ps1'. Instead of putting 'handler.handler' in 'Entry Handler', I'll use 'getNextAvailableIbSubnet.handler'. This tells vRO to look inside 'getNextAvailableIbSubnet.ps1' for a function called 'handler'.
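To make the entry handler concrete, here is a minimal sketch of the shape vRO expects. This is not the repo code; the input names, WAPI version, and container lookup are illustrative assumptions.

# Minimal sketch of getNextAvailableIbSubnet.ps1 (illustrative only).
# vRO calls the function named in 'Entry Handler' with the action inputs.
function handler($context, $inputs) {
    $wapiBase = "https://$($inputs.ibHost)/wapi/v2.10"   # assumed input and WAPI version
    $cred = New-Object System.Management.Automation.PSCredential(
        $inputs.ibUsername,
        (ConvertTo-SecureString $inputs.ibPassword -AsPlainText -Force))

    # Find the parent network container (e.g., 10.10.0.0/16)
    $container = Invoke-RestMethod -Uri "$wapiBase/networkcontainer?network=$($inputs.container)" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck

    # Ask InfoBlox for the next available /24 inside the container
    $next = Invoke-RestMethod -Method Post `
        -Uri "$wapiBase/$($container[0]._ref)?_function=next_available_network&cidr=24&num=1" `
        -Authentication Basic -Credential $cred -SkipCertificateCheck

    # The returned hashtable maps onto the action's Properties return type
    return @{ subnet = $next.networks[0] }
}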
Next we need to change the return type to Properties, and add a few inputs.
Save and run. If everything is in order, you should get the next available InfoBlox subnet from 10.10.0.0/24, as shown in the results of the action run.
My current customer asked if they could use the same vSphere template as an AWS AMI. The current vSphere template has a custom disk layout to help them troubleshoot issues; the default single-disk layout for AMIs actually hinders their troubleshooting methodology.
Aside from the custom disk layout, I knew VMware Tools would have to be replaced with cloud-init. Sure, no problem. Right? Well, actually it wasn't that hard.
I was finally able to get it to work, and I learned a bunch along the way. Those lessons include:
The RHEL default DHCP client is incompatible with AWS.
An EFI BIOS is only supported on larger, more expensive instance types.
AWS VM image import.
Make sure to set 'disable_vmware_customization' to true (enabling a disable flag, if that makes sense).
Requirements
AWS roles, policies and permissions per this document.
S3 bucket (packer-import-example) to store the VMDK until it is imported.
Basic IAM user (packer) with the correct permissions assigned (see above).
vSphere environment to build the image.
A RHEL 8.x DVD ISO for installation.
HTTP repo to store the kickstart file.
Now down to brass tacks. To be honest, it took lots of trial and error (mostly error) to get this working right. For example, on one pass cloud-init wouldn't run on the imported AMI; after looking at cloud.cfg I noticed 'disable_vmware_customization' was set to false instead of true. Another pass failed at import because the machine did not have a 'DHCP client'. That was odd, as it booted up fine in vSphere and got an IP address. Apparently AWS only supports certain DHCP clients. Go figure.
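For reference, the relevant cloud.cfg line ends up looking like this; true tells cloud-init to handle first boot itself instead of deferring to VMware guest customization.

# /etc/cloud/cloud.cfg (excerpt)
disable_vmware_customization: true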
Eventually the machine booted properly in AWS, with the user-data applied correctly. The working user-data is in the repo’s cloud-init directory.
And my super simple vRAC blueprint even worked. This simple BP adds a new user, assigns a password, and grants it SUDO permissions.
Successful vRA Cloud Deployment
A couple of notes on the Packer amazon-import post-processor (a consolidated example follows the list):
The imported images came out encrypted, even though 'ami_encrypt' defaults to false.
‘ami_name’ requires the AWS permission of ec2:CopyImage on the policy for the import role.
Don't use the default encryption key if you wish to share the AMI. You'll need a Customer Managed Key (CMK), and the import role (vmimport) will need to be a key user. Set 'ami_kms_key' to the ID of the CMK (e.g., ebea!!!!!!!!-aaaa-zzzz-xxxxxxxxxxxxxx).
The CMK needs to be shared with the target customer before sharing the AMI. 'ami_org_arns' lets you set the organizations you'd like to share the AMI with.
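Pulling those notes together, a minimal amazon-import stanza looks roughly like the following. The region, AMI name, key ID, and org ARN are placeholders; the bucket name matches the requirements above.

"post-processors": [
  {
    "type": "amazon-import",
    "region": "us-east-1",
    "s3_bucket_name": "packer-import-example",
    "ami_name": "rhel8-custom-disk-layout",
    "ami_encrypt": true,
    "ami_kms_key": "<CMK-key-id>",
    "ami_org_arns": ["arn:aws:organizations::<account>:organization/<org-id>"],
    "keep_input_artifact": false
  }
]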
In this second part, I’ll discuss the actual Code Stream pipeline.
As stated before, the inspiration was William Lam's wonderful PowerShell scripts to deploy a nested environment from a CLI. His original logic was retained as much as possible; however, due to the nature of K8s a few things had to be changed. I'll try to address those as they come up.
After some thought I decided NOT to allow the requester to select the amount of memory, vCPU, or vSAN size. Each ESXi host has 24 GB of RAM and 4 vCPUs, and contributes a touch over 100 GB to the vSAN. The resulting cluster has 72 GB of RAM, 12 vCPUs, and roughly 300 GB of vSAN capacity. Only standard vSwitches are configured on each host.
The code, pipeline, and other information are available in this GitHub repo.
Deployment of the ESXi hosts is initiated by 'deployNestedEsxi.ps1'. There are a few changes from the original script (see the sketch after this list):
The OVA configuration is only grabbed once; then only the host-specific settings (IP address and name) are changed.
The hosts are moved into a vApp once built.
The NetworkAdapter settings are applied after deployment.
The log is persisted to /var/workspace_cache/logs/vsphere-deployment-$BUILDTIME.log.
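The first change looks something like this trimmed sketch; the variable names and guestinfo property paths are assumptions based on the nested ESXi OVA.

# Grab the OVA configuration once, outside the loop
$ovfConfig = Get-OvfConfiguration -Ovf $nestedEsxiOva

foreach ($spec in $hostSpecs) {
    # Only the host-specific settings change per host
    $ovfConfig.Common.guestinfo.hostname.Value  = $spec.Name
    $ovfConfig.Common.guestinfo.ipaddress.Value = $spec.IpAddress

    $vm = Import-VApp -Source $nestedEsxiOva -OvfConfiguration $ovfConfig `
        -Name $spec.Name -VMHost $vmHost -Datastore $datastore -DiskStorageFormat thin

    # Move the new host into the vApp once built
    Move-VM -VM $vm -Destination (Get-VApp -Name $vAppName) | Out-Null
}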
Deployment of the vCSA is handled by 'deployVcsa.ps1'. Some notable changes from the original code include:
Hardcoded the SSO username to administrator@vsphere.local.
Hardcoded the size to ‘tiny’.
Save the log file to /var/workspace_cache/logs/NestedVcsa-$BUILDTIME.log.
Save the configuration template to /var/workspace_cache/vcsajson/NestedVcsa-$BUILDTIME.json.
Move the VCSA into the vApp after deployment is complete.
And finally, 'configureVc.ps1' sets up the cluster and vSAN. Some changes include:
Hardcoded the Datacenter name (DC), and Cluster (CL1).
Import the ESXi hosts by IP (no DNS records are set up for the hosts or vCenter).
Append the configuration results to /var/workspace_cache/logs/vsphere-deployment-$BUILDTIME.log.
So there you go, a quick and simple Code Stream pipeline to deploy a nested vSphere environment in about an hour.
Stay tuned. The next article will include an NSX-T deployment.
One of my peers came up with an interesting use case today. His customer wanted to mount an existing disk on a virtual machine using a vRA Cloud day 2 action.
I couldn’t find an out of the box workflow or action on my vRO, which meant I had to do this thing from scratch.
After a quick look around I found a PowerCLI cmdlet (New-HardDisk) which allowed me to mount an existing disk.
My initial attempts to just run it as a scriptable task resulted in a memory error.
Hmm, so how do you increase the memory in a scriptable task? Simple: you can't. Thus I had to move the script into an action, which does allow me to increase the memory. After some tinkering I found that 256 MB was sufficient to run the code.
function Handler($context, $inputs) {
# $inputs:
## vmName: string
## vcName: string (in configuration element)
## vcUsername: string (in configuration element)
## vcPassword: secureString (in configuration element)
## diskPath: string. Example in code.
# output:
## actionResult: Not used
$inputsString = $inputs | ConvertTo-Json -Compress
Write-Host "Inputs were $inputsString"
$output=@{status = 'done'}
# connect to viserver
Set-PowerCLIConfiguration -InvalidCertificateAction:Ignore -Confirm:$false
Connect-VIServer -Server $inputs.vcName -Protocol https -User $inputs.vcUsername -Password $inputs.vcPassword
# Get vm by name
Write-Host "vmName is $inputs.vmName"
$vm = Get-VM -Name $inputs.vmName
# New-HardDisk -VM $vm -DiskPath "[storage1] OtherVM/OtherVM.vmdk"
$result = New-HardDisk -VM $vm -DiskPath $inputs.diskPath
Write-Host "Result is $result"
return "It worked!"
}
Looking at the code, you will notice an input of vmName (used by PowerShell to find the VM). Getting the vmName is actually pretty simple using JavaScript; my first task in the WF takes care of this.
// get the vmName
// input: vm (VC:VirtualMachine)
// output: vmName (string)
vmName = vm.name;
The next step was to set up a resource action. The settings are shown in the following snapshot. Please note the setting within the green box: 'vm' is set with a binding action.
Changing the binding is fairly simple. Just click the binding link, then change the value to ‘with binding action’. The default values work just fine.
The disk I used in the test was actually a copy of another VM's boot disk. It was copied over to another datastore, then renamed to 'ExistingDisk2.vmdk'. The full diskPath was [dag-nfs] ExistingDisk/ExistingDisk2.vmdk.
Running the day 2 action on a deployed machine worked, as the WF logs show.
So there you have it: a basic polyglot vRO workflow using PowerCLI and JavaScript.
I trust this quick blog was helpful in some small way.
My current customer needs to use 172.18.0.0/16 for their new VMware Cloud on AWS cluster. However, when we tried this in the past we got a "NO ROUTE TO HOST" error when trying to add the VMC vCenter as a cloud account.
The problem was eventually traced back to the 'on-prem-collector' (br-57b69aa2bd0f) network in the Cloud Proxy, which uses the same subnet.
Let's say the vCenter's IP is 172.18.32.10. From inside the cloudassembly-sddc-agent container I try to connect to the vCenter, eventually getting a 'No route to host' error. Can anyone say classic overlapping IP space?
We reached out to our VMware Customer Success Team and TAM, who eventually provided a way to change the Cloud Proxy docker and on-prem-collector subnets.
Now for the obligatory warning. Don’t try this in production without having GSS sign off on it.
In this example I’m going to change the docker network to 192.168.0.0/24 and the on-prem-collector network to 192.168.1.0/24.
First, update the docker interface range.
Add the following two lines to /etc/docker/daemon.json. Don’t forget to add the necessary comma(s). Then save and close.
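Assuming the standard keys for moving the default bridge, the additions look something like this (merge them into the existing daemon.json, keeping its other entries):

"bip": "192.168.0.1/24",
"fixed-cidr": "192.168.0.0/24"

Restart the docker daemon afterwards so the new range takes effect.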
Check which containers are using this network with docker network inspect on-prem-collector. Mine had two: cloudassembly-sddc-agent and cloudassembly-cmx-agent.
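From there, the general shape of the on-prem-collector change is the following sketch. These are standard docker commands, not GSS-blessed steps; the container names come from the inspect output above.

# detach the agents from the old 172.18.0.0/16 bridge
docker network disconnect on-prem-collector cloudassembly-sddc-agent
docker network disconnect on-prem-collector cloudassembly-cmx-agent

# recreate the bridge on the new range, then reattach the agents
docker network rm on-prem-collector
docker network create --driver bridge --subnet 192.168.1.0/24 on-prem-collector
docker network connect on-prem-collector cloudassembly-sddc-agent
docker network connect on-prem-collector cloudassembly-cmx-agent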
In this blog I’ll explore the new Secret capability in vRealize Automation Cloud. The use case includes the following:
Deploy a CentOS 8 machine
Store the new user password and SSH key in a Secret
Configure the machine using Cloud-Init
Assign the password and SSH key from a vRA Cloud Secret
Verify the password and SSH key assignment.
First, add two Secrets. Go to Infrastructure -> Secrets, then click NEW SECRET.
The first one will be the SSH key. Find your project, give it a name, then paste in the key. Click CREATE to save the values. Repeat the process for the Password secret.
The Cloud Template is fairly straightforward. The new user's password is assigned the secret.Blog_Password value, and the ssh_authorized_keys comes from the secret.Blog_SSH_Key value, as sketched below.
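Here is a trimmed sketch of the template's cloudConfig. The user name and the chpasswd approach are my assumptions; the secret bindings are the point.

resources:
  Cloud_vSphere_Machine_1:
    type: Cloud.vSphere.Machine
    properties:
      image: CentOS 8
      cloudConfig: |
        #cloud-config
        ssh_pwauth: true
        users:
          - name: bloguser
            sudo: ['ALL=(ALL) NOPASSWD:ALL']
            ssh_authorized_keys:
              - ${secret.Blog_SSH_Key}
        chpasswd:
          list: |
            bloguser:${secret.Blog_Password}
          expire: false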
Now a look at how the secret values are displayed on a deployed machine. Open the deployment, then click on the machine. Expand Cloud Config to view the secret values sent to the machine.
As you can see, the values for the new user are encrypted and do not match the stored secret values (the user password is set to VMware1!). Good so far.
Now to see if the password and SSH key actually work. A quick SSH using the key should be sufficient.
Oops. Looks like the key didn't work, but I was able to log in using the password. Time for a bit of troubleshooting. Using elevated permissions (set in cloud-init), I took a look at the cloud-init config sent down to the machine.
# more /var/lib/cloud/instance/cloud-config.txt
Hmm, looks like the key had a line return in it.
I’ll need to edit/update the Blog_SSH_Key secret. After finding my troublesome secret, I click Edit.
The previously stored value is not viewable; I can only update it.
The new value is viewable until I save it. I made sure this one didn't have a line return in it. The change is committed when I click Save.
Now to test the change on a newly deployed machine. I’ll use the same SSH command, with the exception of changing the IP address.
Success! I was able to log in using the key.
In this blog I explored a simple application using two vRA Cloud Secrets, troubleshooting, and updating a secret value. The VMware developers did a great job; I'm sure the new feature will prove to be very valuable.
I’m not sure when this will get pushed down into vRA 8.x. Please contact your VMware team for more information.
The program “is about giving back to the community beyond your day job”.
One way I give back is by posting new and unique content here once or twice a month. Sometimes a post is simply me clearing a thought before the weekend, completing a commitment to a BU, or documenting something before moving on to another task. It doesn’t take long, but could open the door for one of my peers.
My most frequently used benefit is the vExpert and Cloud Management Slack channels. I normally learn something new every week. And it sure does feel good to help a peer struggling with something I've already tinkered with.
Here's a list of some of the benefits of receiving the award.
Networking with 2,000+ vExperts / Information Sharing
Knowledge Expansion on VMware & Partner Technology
Opportunity to apply for vExpert BU Lead Subprograms
Possible Job Opportunities
Direct Access to VMware Business Units via Subprograms
Blog Traffic Boost through Advocacy, @vExpert, @VMware, VMware Launch & Announcement Campaigns
1 Year VMware Licenses for Home Labs for almost all Products & Some Partner Products
Private VMware & VMware Partner Sessions
Gifts from VMware and VMware Partners
vExpert Celebration Parties at both VMworld US and VMworld Europe with VMware CEO, Pat Gelsinger
VMware Advocacy Platform Invite (share your content to thousands of vExperts & VMware employees who amplify your content via their social channels)
Private Slack Channels for vExpert and the BU Lead Subprograms
Applications close on January 9th, 2021, so start working on yours now.
First off, I found the built-in Code Stream REST tasks do not have a retry. I learned this the hard way when VMware's cloud offering had issues last month; at times a request would get a 500 error back, resulting in a failed execution.
This forced me to look at Python custom integrations which would retry until the correct success code was returned. I was able to get the pipeline working, but it had a lot of repetitive code, lacked the ability to limit the number of retries, and was based on Python 2.
Seeing the error of my ways, I decided to refactor the code again with a custom module (for the repetitive code) and migrate to Python 3.
The original Docker image was CentOS based and did not have Python 3 installed. Instead of installing Python 3 and increasing the size of the image, I opted to start with the official Docker Python 3 image. I'll get to the build file later.
Now on to the actual refactoring. Here I wanted to combine the reused code into a custom Python module. My REST calls include POST (to get a Bearer Token), GET (with and without a filter), PATCH (to update Image Mappings), and DELETE (to delete the test Image Profile and Cloud Template).
This module snippet includes PostBearerToken_session, which returns the bearerToken and other headers, and GetFilteredVrac_sessions, which returns a filtered GET request. It also limits the retries to 5 attempts.
import requests
import logging
import os
import json
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry
# retry strategy to tolerate API errors. Will retry if the status_forcelist matches.
retry_strategy = Retry(
    total=5,
    status_forcelist=[429, 500, 502, 503, 504],
    method_whitelist=["GET", "POST"],
    backoff_factor=2,
    raise_on_status=True,
)
adapter = HTTPAdapter(max_retries=retry_strategy)
https = requests.Session()
https.mount("https://", adapter)
vRacUrl = "https://api.mgmt.cloud.vmware.com"

def PostBearerToken_session(refreshToken):
    # POST to get the bearerToken from the refreshToken.
    # Returns the headers with Authorization populated.
    # Build the post payload.
    pl = {}
    pl['refreshToken'] = refreshToken.replace('\n', '')
    logging.info('payload is ' + json.dumps(pl))
    loginURL = vRacUrl + "/iaas/api/login"
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json"
    }
    r = https.post(loginURL, json=pl, headers=headers)
    responseJson = r.json()
    token = "Bearer " + responseJson["token"]
    headers['Authorization'] = token
    return headers

def GetFilteredVrac_sessions(requestUrl, headers, requestFilter):
    # GET a resource using a filter.
    requestUrl = vRacUrl + requestUrl + requestFilter
    print(requestUrl)
    adapter = HTTPAdapter(max_retries=retry_strategy)
    https = requests.Session()
    https.mount("https://", adapter)
    r = https.get(requestUrl, headers=headers)
    responseJson = r.json()
    return responseJson
Here is the working Code Stream custom integration used to test this module. It gets the headers, then sends a filtered request using the Cloud Account name. It then pulls some information out of the payload and prints it out.
runtime: "python3"
code: |
  import json
  import requests
  import os
  import sys
  import logging

  # append /build to the path
  # this is where the custom python module is copied to
  sys.path.append('/build')
  import vRAC

  # context.py is automatically added.
  from context import getInput, setOutput

  def main():
      refreshToken = getInput('RefreshToken')
      # authHeaders will have all of the required headers, including the bearerToken
      authHeaders = vRAC.PostBearerToken_session(refreshToken)
      # test the GET function with a filter
      requestUrl = "/iaas/api/cloud-accounts"
      requestFilter = "?$filter=name eq '" + getInput('cloudAccountName') + "'"
      # get the cloudAccount by name
      cloudAccountJson = vRAC.GetFilteredVrac_sessions(requestUrl, authHeaders, requestFilter)
      # pull some specific data out for later
      cloudAccountId = cloudAccountJson['content'][0]['id']
      logging.info('cloudAccountId: ' + cloudAccountId)

  if __name__ == '__main__':
      main()
inputProperties: # Enter fields for input section of a task
  # Password input
  - name: RefreshToken
    type: text
    title: vRAC Refresh Token
    placeHolder: 'secret/password field'
    defaultValue: changeMe
    required: true
    bindable: true
    labelMessage: vRAC RefreshToken
  - name: cloudAccountName
    type: text
    title: Cloud Account Name
    placeHolder: vc.corp.local
    defaultValue: vc.corp.local
    required: true
    bindable: true
    labelMessage: Cloud Account Name
Next, the new Docker image. This uses the official Python 3 image as a starting point. The build file copies everything over (including the custom module and requirements.txt), then installs 'requests'.
FROM python:3
WORKDIR /build
COPY . ./
RUN pip install --no-cache-dir -r requirements.txt
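The referenced requirements.txt holds the module's one external dependency:

requests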
Now that the framework is ready, it's time to create the pipeline and test it. This is well documented in Creating and using pipelines in VMware Code Stream. Update the Host field with your predefined Docker host, set the Builder image URL (Docker Hub repo and tag), and set the Working directory to '/build' (to match WORKDIR in the Dockerfile).
Running the pipeline worked and returned the requested information.
This was a fairly brief article. I really just wanted to get everything written down before the weekend. I’ll have more in the near future.
First off, the Image Mappings shown in the UI cannot be updated via the API directly. The API allows you to create, change, and delete Image Profiles. An Image Profile contains the Image Mappings and is tied to a Region.
For example, in this image the IaC Image Mappings are displayed.
Here are some of the same Image Mappings as seen when you GET the Image Profile by ID.
{ "imageMapping": { "mapping": { "IaC-build-test-patch": { "id": "8fc331632163f53fd0c66e0407495504295b4c1c", "name": "", "description": "Template: vSphere-CentOS8-CUSTOM-2020.09.25.181019" }, "IaC-prod-profile": { "id": "2e2d31be93c59531d2c1eeeadc58f68b66174559", "name": "", "description": "Template: vSphere-CentOS8-CUSTOM-2020.09.25.144314" }, "IaC-test-profile": { "id": "842c91f05185978d62d201df3b47d1505cf3fea3", "name": "", "description": "Generic CentOS 7 template with cloud-init installed and VM hardware version 13 (compatible with ESXi 6.5 or greater)." } } }, "regionId": "71cecc477594a67558b9d5xxxxxxx", "name": "IaC-build-test-profile", "description": "Packer build image for testing" }
But how do you get the Image Profile ID? I ended up using a filter based on the externalRegionId (with another filtered search to find the externalRegionId by Region name).
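With the module from earlier, that lookup chain looks something like this; the region name is a placeholder, and the exact filter fields are assumptions against the IaaS API.

import vRAC

# refreshToken arrives as a pipeline input, as in the custom integration above
authHeaders = vRAC.PostBearerToken_session(refreshToken)

# find the externalRegionId by Region name ('us-west-1' is a placeholder)
regionJson = vRAC.GetFilteredVrac_sessions(
    "/iaas/api/regions", authHeaders,
    "?$filter=name eq 'us-west-1'")
externalRegionId = regionJson['content'][0]['externalRegionId']

# then find the Image Profile tied to that region
profileJson = vRAC.GetFilteredVrac_sessions(
    "/iaas/api/image-profiles", authHeaders,
    "?$filter=externalRegionId eq '" + externalRegionId + "'")
imageProfileId = profileJson['content'][0]['id']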
Then, using the cloudAccountJson returned previously, I built a new body for the request.
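Roughly, building and sending the body looks like this (fabricImageId is a hypothetical lookup result; the field names follow the GET payload above):

import requests

# build the replacement body; include EVERY mapping you want to keep,
# because anything left out of the POST/PATCH body is lost
body = {
    "name": "IaC-build-test-profile",
    "description": "Packer build image for testing",
    "regionId": regionId,  # from the earlier GET of the profile
    "imageMapping": {
        "IaC-build-test-patch": {"id": fabricImageId}  # hypothetical lookup result
    }
}
r = requests.patch(vRAC.vRacUrl + "/iaas/api/image-profiles/" + imageProfileId,
                   json=body, headers=authHeaders)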
Now some gotchas.
First, remember that the Image Mappings are tied to the Region. You will lose any Image Mappings NOT included in the POST/PATCH body. Make sure you back up the Image Profile settings (do a GET by Image Profile ID) before attempting to change the mappings via the API.
Secondly, an Image Profile does not have a name by default; you need to set one via the API. Why would you need it? Well, you may want to find the Image Profile by name later. My current customer creates new customer Image Profiles via the API and uses a 'tag'-like naming convention.
Thirdly, I've experienced several 500 errors when interfacing with the vRA Cloud API. The out-of-the-box Code Stream REST tasks do not retry, so I ended up writing Python custom integrations as a workaround. These retry until receiving the correct response code (I've seen up to fifteen 500 errors before getting a 200).
This is just one thing I’ve learned about the vRA Cloud API, and Code Stream. I’ll post more as I have time.
I've been anxiously waiting for the new Terraform Resources for vRealize Automation Cloud. Well, the wait is finally over: version 8.20 was released on 8/30/2020. You can view the Release Notes here.
Now why would you want to use Terraform in vRA Cloud? You can already do a bunch out of the box. But what if you wanted to deploy an AWS EC2 instance with an encrypted boot disk? That is not available natively.
Terraform to the rescue. You can use Terraform to fill those kinds of gaps without using a vRealize Orchestrator or Extensibility Appliance.
In this article I’ll show you how to deploy a basic AWS EC2 instance with an encrypted boot disk using the new Terraform Resource.
The Terraform configuration files and blueprint are available here.
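If you just want the flavor of it, a minimal main.tf for this use case looks something like the sketch below. The AMI ID is a placeholder, and the variable names match the template inputs shown later.

variable "region"       { default = "us-east-2" }
variable "ssh_key_name" { default = "changeMe" }
variable "hostname"     { default = "changeMe" }

provider "aws" {
  region = var.region
}

resource "aws_instance" "demo" {
  ami           = "ami-changeMe"   # placeholder - use a real AMI for the region
  instance_type = "t3.micro"
  key_name      = var.ssh_key_name

  # the piece vRA Cloud can't do out of the box: encrypt the boot disk
  root_block_device {
    encrypted = true
  }

  tags = {
    Name = var.hostname
  }
}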
Now on to the good stuff. Within vRAC, create a new Cloud Template (renamed from Blueprints) by clicking Design -> NEW FROM -> Terraform.
Enter the Template name and project on the next page, then click Next. The next page is where you select your GitHub Repository, Commit and the Source Directory. Then click Next.
For this example I'm leaving all of the variables as they come from variables.tf in the source directory. Click Next, which brings you to the designer page.
The code will look something like this, with one notable exception depending on how your repo's directory structure is laid out. The wizard assumes your sourceDirectory is directly off the root, for example root/sourceDirectory. But what if you have a path that looks like root/terraform/sourceDirectory? The wizard will not add '/terraform' automagically; you will need to fix it (the sample below is already fixed).
inputs:
  region:
    type: string
    default: us-east-2
  ssh_key_name:
    type: string
    default: changeMe
  hostname:
    type: string
    default: changeMe
resources:
  terraform:
    type: Cloud.Terraform.Configuration
    properties:
      variables:
        region: '${input.region}'
        ssh_key_name: '${input.ssh_key_name}'
        hostname: '${input.hostname}'
      providers:
        - name: aws
          # List of available cloud zones: will get populated during create
          cloudZone: *********
      terraformVersion: 0.12.26
      configurationSource:
        repositoryId: XXXXXXXXX
        commitId: XXXXXXXXX
        sourceDirectory: /terraform/Basic AWS
If all goes as expected, you can now deploy the new template.
Make sure to change the default variable values to match your environment. The ssh_key_name needs to exist in the region you are deploying into.
Click on the Deployment History to see the Terraform plan and apply logs. They can be viewed by clicking 'Show Logs' on the PLAN_IN_PROGRESS or CREATE_IN_PROGRESS status lines.
The logs can also be viewed in a new browser tab by clicking on ‘View as plain text’ from the expanded ‘Show Logs’ window.
Hopefully the instance deployed correctly. If so, take a coffee break; this gives vRAC time to discover the new 'aws_instance' and enable some day 2 actions.
The day 2 actions vary depending on the deployed machine type. You can find more information on page 416 of the Using and Managing VMware Cloud Assembly documentation.
But did the boot disk actually get encrypted? Yes! Here is a screenshot of the boot volume.
As you can see, the new Terraform Resource is a great way to fill in vRAC deployment gaps without using extensibility.