VMWare vRealize Automation Cloud Image Profiles lessons learned

Over the last month I’ve spent most of my time trying to replicate Sam McGeown’s Build, test and release VM images with vRealize Automation Code Stream and Packer in vRealize Automation Cloud. One of the tasks is to create or update an Image Profile. The following article details some of the things I’ve learned.

First off, the Image Mappings shown in the UI cannot be updated directly via the API. Instead, the API allows you to create, change, and delete Image Profiles. An Image Profile contains the Image Mappings and is tied to a Region.

For example, in this image the IaC Image Mappings are displayed.

Here are some of the same Image Mappings as seen when you GET the Image Profile by ID.

{
  "imageMapping": {
    "mapping": {
      "IaC-build-test-patch": {
        "id": "8fc331632163f53fd0c66e0407495504295b4c1c",
        "name": "",
        "description": "Template: vSphere-CentOS8-CUSTOM-2020.09.25.181019"
      },
      "IaC-prod-profile": {
        "id": "2e2d31be93c59531d2c1eeeadc58f68b66174559",
        "name": "",
        "description": "Template: vSphere-CentOS8-CUSTOM-2020.09.25.144314"
      },
      "IaC-test-profile": {
        "id": "842c91f05185978d62d201df3b47d1505cf3fea3",
        "name": "",
        "description": "Generic CentOS 7 template with cloud-init installed and VM hardware version 13 (compatible with ESXi 6.5 or greater)."
      }
    }
  },
  "regionId": "71cecc477594a67558b9d5xxxxxxx",
  "name": "IaC-build-test-profile",
  "description": "Packer build image for testing"
}

But how do you get the Image Profile ID? I ended up using a filter based on externalRegionId (I used another filtered search to find the externalRegionId by Region name).

https://api.mgmt.cloud.vmware.com/iaas/api/image-profiles?$filter=externalRegionId eq 'Datacenter:datacenter-xyz'
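As a sketch, building that filtered URL and pulling the profile id out of the response can be done with a couple of small Python helpers. The names profile_filter_url and extract_profile_id are my own, and the payload shape follows the filtered-search response; verify the endpoint and filter support against the vRA Cloud API docs for your version.

```python
from urllib.parse import quote

API = "https://api.mgmt.cloud.vmware.com"

def profile_filter_url(external_region_id):
    # Build the filtered image-profiles URL shown above,
    # URL-encoding the $filter expression
    query = quote(f"externalRegionId eq '{external_region_id}'")
    return f"{API}/iaas/api/image-profiles?$filter={query}"

def extract_profile_id(payload, profile_name):
    # Pull the Image Profile id out of the filtered search response
    for item in payload.get("content", []):
        if item.get("name") == profile_name:
            return item["id"]
    return None
```

The extracted id is what the later PATCH call needs.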

This returned the following payload (cleaned up a bit). I reference it later as cloudAccountJson.

{
  "content": [
    {
      "imageMappings": {
        "mapping": {
          "IaC-prod-profile": {
            ...
          },
          "IaC-build-test-patch": {
            ...
          },
          "IaC-test-profile": {
            ...
          }
        },
        "externalRegionId": "",
        ...
      },
      "externalRegionId": "Datacenter:datacenter-xyz",
      ...
      "name": "IaC-build-test-profile",
      "description": "Packer build image for testing",
      "id": "fa57fef8-5d0e-494b-b299-7e4a9030ac11-71cecc477594a67558b9d5f056260",
      ...
    }
  ],
  "totalElements": 1,
  "numberOfElements": 1
}

The id will be used later to update (PATCH) the Image Profile.

Now to build the PATCH body to update the Profile. The API has the following body example.

{
  "name": "string",
  "description": "string",
  "imageMapping": "{ \"ubuntu\": { \"id\": \"9e49\", \"name\": \"ami-ubuntu-16.04-1.9.1-00-1516139717\"}, \"coreos\": { \"id\": \"9e50\", \"name\": \"ami-coreos-26.04-1.9.1-00-543254235\"}}",
  "regionId": "9e49"
}

Then, using the cloudAccountJson returned previously, I built a new body using the following (partial) code (I couldn’t get the formatting right, hence the image).
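Here is a rough Python sketch of what that body-building code does, assuming the cloudAccountJson payload shape above; build_patch_body is a hypothetical helper name. Note the API example expects imageMapping as a JSON-encoded string, and merging in the existing mappings keeps them from being dropped by the PATCH.

```python
import json

def build_patch_body(cloud_account_json, new_mappings):
    # Merge new mappings into the existing ones so nothing is lost,
    # then encode imageMapping as the JSON string the API example expects.
    profile = cloud_account_json["content"][0]
    mapping = dict(profile["imageMappings"]["mapping"])
    mapping.update(new_mappings)
    return {
        "name": profile["name"],
        "description": profile.get("description", ""),
        # the API example also carries a regionId; add it from your profile payload
        "imageMapping": json.dumps(mapping),
    }
```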

Now some gotchas.

First, remember that the Image Mappings are tied to the Region. You will lose any Image Mappings NOT included in the POST/PATCH body. Make sure you back up the Image Profile settings (do a GET by Image Profile ID) before attempting to change the mappings via the API.

Secondly, an Image Profile does not have a name by default; you need to set one via the API. Why would you need it? Well, you may want to find the Image Profile by name later. My current customer creates new custom Image Profiles via the API and uses a ‘tag’-like naming convention.

Thirdly, I’ve experienced several 500 errors when interfacing with the vRA Cloud API. The out-of-the-box Code Stream REST tasks do not retry, so I ended up writing custom Python integrations as a workaround. These retry until receiving the correct response code (I’ve seen up to fifteen 500 errors before getting a 200).
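A minimal sketch of such a retry loop, with the call injected so it works with any HTTP client; the function name, attempt count, and status list are my own choices, not the exact code from my integrations.

```python
import time

def call_with_retries(call, max_attempts=20, delay=2, retry_on=(500, 502, 503)):
    # Keep calling until we get a non-retryable response or run out of
    # attempts. `call` is any zero-argument function returning an object
    # with a `status_code` attribute (e.g. a lambda wrapping requests.get).
    response = None
    for _ in range(max_attempts):
        response = call()
        if response.status_code not in retry_on:
            return response
        time.sleep(delay)
    return response  # last response, even if still an error
```

Usage would look like `call_with_retries(lambda: requests.get(url, headers=headers))`.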

This is just one thing I’ve learned about the vRA Cloud API, and Code Stream. I’ll post more as I have time.

Optionally adding disks with vRealize Automation Cloud

One of the common use cases I see is having the ability to optionally add disks to a machine in vRealize Automation Cloud.

For example, one requester may just want a basic machine with just the OS disk, while another may want several to support SQL Server.

In this article I’ll show you how to add up to four additional disks using an undocumented vRA Cloud function. The other available functions are listed on this VMware documentation page.

Now down to details. What we need to do is create some property bindings for the optional disks, then attach them to the machine using ‘map_to_object’. The grey dashed lines indicate an implicit or property binding in the canvas. Additional information about this kind of bind is available at this VMware documentation page.

Implicit or property bindings

Four inputs are needed, one for each disk. Each disk that is NOT zero size will be created for the machine.

  Cloud_Machine_1:
    type: Cloud.Machine
    properties:
      name: '${input.hostname}'
      image: Windows 2019
      flavor: generic.medium
      attachedDisks: '${map_to_object(resource.Cloud_Volume_1[*].id + resource.Cloud_Volume_2[*].id + resource.Cloud_Volume_3[*].id + resource.Cloud_Volume_4[*].id, "source")}'

Here is the complete YAML for the blueprint.

formatVersion: 1
name: Optional disks
version: 1
inputs:
  hostname:
    type: string
    default: changeme
    description: Desired hostname
  password:
    type: string
    encrypted: true
    default: Password1234@$
    description: Desired password for the local administrator
  ipAddress:
    type: string
    default: 10.10.10.10
    description: Desired IP Address
  disk1Size:
    type: integer
    default: 5
    description: A SIZE of 0 will disable the disk and it will not be provisioned.
  disk2Size:
    type: integer
    default: 10
    description: A SIZE of 0 will disable the disk and it will not be provisioned.
  disk3Size:
    type: integer
    default: 15
    description: A SIZE of 0 will disable the disk and it will not be provisioned.
  disk4Size:
    type: integer
    default: 20
    description: A SIZE of 0 will disable the disk and it will not be provisioned.
resources:
  Cloud_Machine_1:
    type: Cloud.Machine
    properties:
      name: '${input.hostname}'
      image: Windows 2019
      flavor: generic.medium
      remoteAccess:
        authentication: usernamePassword
        username: Administrator
        password: '${input.password}'
      resourceGroupName: '${env.projectName}'
      attachedDisks: '${map_to_object(resource.Cloud_Volume_1[*].id + resource.Cloud_Volume_2[*].id + resource.Cloud_Volume_3[*].id + resource.Cloud_Volume_4[*].id, "source")}'
      networks:
        - network: '${resource.Cloud_Network_1.id}'
          assignment: static
          address: '${input.ipAddress}'
  Cloud_Volume_1:
    type: Cloud.Volume
    properties:
      count: '${input.disk1Size == 0 ? 0 : 1 }'
      capacityGb: '${input.disk1Size}'
  Cloud_Network_1:
    type: Cloud.Network
    properties:
      networkType: existing
      constraints:
        - tag: 'network:vsanready_vlan_14'
  Cloud_Volume_2:
    type: Cloud.Volume
    properties:
      count: '${input.disk2Size == 0 ? 0 : 1}'
      capacityGb: '${input.disk2Size}'
  Cloud_Volume_3:
    type: Cloud.Volume
    properties:
      count: '${input.disk3Size == 0 ? 0 : 1 }'
      capacityGb: '${input.disk3Size}'
  Cloud_Volume_4:
    type: Cloud.Volume
    properties:
      count: '${input.disk4Size == 0 ? 0 : 1}'
      capacityGb: '${input.disk4Size}'

Now to test it. I’ll deploy a machine with four additional disks. Here is the request form with the default disk sizes.

After deploying the machines, you may find the disks did not get added in order. This is a known issue; the offshore developers told me that ordered addition of disks is not supported at this point (July 2020). Here is a screen shot of the deployed machines. Notice the order; it is not the same as my request.

Out of order disks

In mid-July 2020 they released a new vRA Cloud version with additional data for the block-devices. At the time of writing this article, the new properties were not included in the block-device model in the API documentation.

They are neatly stashed under customProperties.

    "customProperties": {
        "vm": "VirtualMachine:vm-880",
        "vcUuid": "b24faac3-b21c-4ee9-99f1-c5436d351ecb",
        "persistent": "true",
        "independent": "false",
        "provisionGB": "10",
        "diskPlacementRef": "Datastore:datastore-541",
        "provisioningType": "thin",
        "controllerUnitNumber": "1",
        "controllerKey": "1000",
        "providerUniqueIdentifier": "Hard disk 2"
    }

As you can see, they provide the controller key (controllerKey), unit number (controllerUnitNumber), and the provider-generated unique identifier (providerUniqueIdentifier).

The idea was to provide this information for those organizations wishing to reorder the disks or even move them to new disk controllers to support their various server deployments.
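With those properties, putting the disks back into the order vSphere sees them becomes a simple sort. A sketch, assuming the string-valued customProperties shown above (the function name is my own):

```python
def order_disks(block_devices):
    # Sort a deployment's disks by controller, then unit number.
    # The customProperties values arrive as strings, so cast to int.
    def placement(device):
        props = device.get("customProperties", {})
        return (int(props.get("controllerKey", 0)),
                int(props.get("controllerUnitNumber", 0)))
    return sorted(block_devices, key=placement)
```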

These additional properties may make it into the next version of vRA 8. But who knows what makes the cut.

Until next time, stay safe and healthy.

vRA Cloud Sync Blueprint Versions to Github

The current implementation of vRealize Automation Cloud and Git integration for Blueprints is read-only, meaning you download the new Blueprint version into a local repo, then push it. After a few minutes vRA Cloud will see the new version and update the design page. It’s really a pain, if you know what I mean.

What I really wanted was to automatically push the new or updated Blueprint when a new version is created.

The following details one potential solution using vRA Cloud ABX actions in a flow on Lambda.

The flow consists of three parts.

  1. Retrieve the vRA Cloud refresh token from an AWS Systems Manager Parameter, then exchange it for a bearer token (get_bearer_token_AWS). It returns the bearer token as ‘bearer_token’.
  2. Get Blueprint Version Content. This uses ‘bearer_token’ to get the new Blueprint Version payload and return it as ‘bp_version_content’.
  3. Then Add or Update Blueprint on Github. This action converts the ‘bp_version_content’ from JSON into YAML. It also adds or updates the two required properties, ‘name’ and ‘version’. Both values come from the content retrieved from step two. It also clones the repo, checks to see if the blueprint exists. Then it either creates a Blueprint folder with blueprint.yaml, or updates an existing blueprint.yaml.
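The create-or-update logic in step three can be sketched like this; the one-folder-per-blueprint layout with a blueprint.yaml inside matches the description above, but the function name and return convention are my own.

```python
from pathlib import Path

def write_blueprint(repo_root, blueprint_name, yaml_text):
    # Create the blueprint's folder on first sight, otherwise just
    # overwrite its blueprint.yaml. Returns True when newly created.
    path = Path(repo_root) / blueprint_name / "blueprint.yaml"
    created = not path.exists()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(yaml_text)
    return created
```

The actual action would commit and push the repo after this call.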

The vRA Cloud Refresh Token and Github API key are stored in an AWS SSM Parameter. Please take a look at one of my previous articles on how to set this up.

‘get_bearer_token_AWS’ has two inputs. region_name is the AWS region, and refreshToken is the SSM Parameter containing the vRA Cloud refresh token.

Action 2 (Blueprint Version Content) uses the bearer token returned by Action 1 to get the blueprint version content.

The final action, consumes the blueprint content returned by action 2. It has three inputs, githubRepo is the repo configured in your github project, githubToken is the SSM Parameter holding the Github key, and finally region_name is the AWS region where the Parameter is configured.

Create a new Blueprint version configuration subscription, using the flow as the target action, and filtering the event with event.data.eventType == 'CREATE_BLUEPRINT_VERSION'.

Now to test the solution. Here I have a very basic blueprint. Make sure you add the name and version properties; the name value should match the actual blueprint name. Now create a new version, then wait until Github does another inventory.

You may notice the versioned Blueprint will show up a second time, now being managed by Github. I think vRA Cloud adds the discovered blueprints on Github with a new Blueprint ID. The fix is pretty easy: just delete the original blueprint after making sure the imported one still works.

The flow bundle containing all of the actions is available in this repository.

Spas Kaloferov recently posted a similar solution for gitlab. Here is the link to his blog.

Using AWS SSM Parameters with vRA Cloud ABX actions

One common integration use case is to securely store passwords and tokens. In this article I’ll show you how to retrieve and decrypt an AWS Systems Manager (SSM) Parameter (a vRA Cloud refresh token), make a vRA Cloud API call to claim a bearer token, and finally return the deployment name from a second vRA Cloud API call.

I’m not going to discuss how to get the API Token. Detailed instructions are available in this VMware Blog.

I’ll store this token in an AWS SSM Parameter called VRAC_REFRESH_TOKEN as a secure string. Again this is really beyond the scope of this article. Please refer to AWS Systems Manager Parameter Store page for more information.

The following action will need access to this new Parameter. Here I’m creating a new role named blog-ssm-sample-role. I used an inline policy to allow access to every Parameter using these settings.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssm:DescribeParameters"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ssm:GetParameters"
            ],
            "Resource": "*"
        }
    ]
}

You will most likely want to be more granular in a production environment. This role will also need the AWSLambdaBasicExecutionRole.

Now to start building the Python ABX action. This action uses two default inputs, region_name and refreshToken. Then add requests and boto3 as dependencies. SSM is only available on AWS, so the FaaS provider is set to Amazon Web Services. And finally, set the IAM role to my sample role.

And now the function. It grabs the refresh token from the Parameter Store, gets a vRA bearer token, then gets the deployment name, which is returned when the function completes.

import json
import logging
import requests
import boto3


logger = logging.getLogger()
logger.setLevel(logging.INFO)

VRAC_API_URL = "https://api.mgmt.cloud.vmware.com"

def handler(context, inputs):
    '''
    Get secrets 
    '''
    vrac_refresh_token = get_secrets(inputs['region_name'],inputs['refreshToken'])

    ''' 
    get vRAC bearer_token
    work around as the context does not contain auth information for this event
    context.request is responding with Not authenticated
    '''
    bearer_token = get_vrac_bearer_token(vrac_refresh_token)

    '''
    Get the deployment name using deploymentId from inputs
    '''
    deployment_name = get_deployment_name(inputs,bearer_token)

    outputs = {}
    outputs['deploymentName'] = deployment_name
    return outputs


def get_secrets(region, ssm_parameter):
    # Create an SSM client and fetch the decrypted parameter value
    session = boto3.session.Session()
    ssm = session.client(
        service_name='ssm',
        region_name=region)
    parameter_secret = ssm.get_parameter(Name=ssm_parameter, WithDecryption=True)
    return parameter_secret['Parameter']['Value']


def get_deployment_name(inputs, bearer_token):
    url = VRAC_API_URL + "/deployment/api/deployments/" + inputs['deploymentId']
    headers = {"Authorization": "Bearer " + bearer_token}
    result = requests.get(url=url, headers=headers)
    result_data = result.json()
    deployment_name = result_data["name"]
    logger.info("### deployment name is %s", deployment_name)
    return deployment_name


def get_vrac_bearer_token(vrac_refresh_token):
    url = VRAC_API_URL + "/iaas/api/login"
    payload = { "refreshToken": vrac_refresh_token }
    result = requests.post(url = url, json = payload)
    result_data = result.json()
    bearer_token = result_data["token"]
    return bearer_token

Create a new Deployment Complete subscription, using the new ABX action.

Next request a new deployment, waiting until it completes. Then check the Action Run under Extensibility -> Action Runs. If all went as expected you should see the deployment name in the Details -> Outputs section.

This simple use case allows vRA Cloud ABX to recover and use secure data stored in an AWS SSM Parameter.

See you later.

AWS IPAM with vRealize Automation Cloud and InfoBlox Part 2

This is the second part of this series. In this article we will complete the configuration of InfoBlox, then set up IPAM in vRealize Automation Cloud (vRAC). And finally, we will deploy a two-machine blueprint to test the Allocation and Deallocation Lambda functions.

The first thing is to add some attributes required by vRAC within InfoBlox. Click on Administration -> Extensible Attributes. Add the two attributes shown below.

  • VMware NIC index (lower case I), type Integer
  • VMware resource ID, type String

Click on the Add Button, type in the new Attribute name and Type, then click Save & Close. Then repeat for the other Attribute.

Next we need to set up an IPAM range. Here I’m going to create a small range in 172.31.32.0/16. Click on Data Management -> IPAM. Select List, then check the box next to 172.31.32.0/16.

Click Add -> Range -> IPv4.

Add the range following these steps.

  • Step 1, Next
  • Step 2, Enter the range start/end and Range name. Then Next.
  • Step 3, Next
  • Step 4, Next
  • Step 5, Save & Close

Download the InfoBlox Plugin from the VMware Exchange.

Now to add the endpoint in vRAC. Click on Infrastructure -> ADD INTEGRATION. Then click on IPAM.

Click on MANAGE IPAM PROVIDERS.

Then IMPORT PROVIDER PACKAGE, then select the package you downloaded earlier.

The import will take a few minutes. Next select Infoblox from the Provider drop-down box.

Give the Integration a name, then select your Running Environment (Cloud Account), Username, Password, and Hostname (IP or hostname, for example 10.10.10.10 or myipam.corp.local; do not append HTTPS). Then check the box next to Infoblox.IPAM.DisableCertificateCheck and click the pencil to edit.

Change Value to True to disable the certificate check.

Next Validate the connection and Save it.

Next assign the IPAM range to a vRAC network.

Go to Infrastructure -> Networks, then select the network hosting 172.31.32.0/16. Click the box to the left, then MANAGE IP RANGES.

Select External -> Your Provider -> and your Address space (default). Then check the network hosting your IPAM Range.

Add the network to an existing or new Network Profile.

Now it’s time to test the integration. Here I have a blueprint with two machines. The first will get the next available IP out of the Range (172.31.32.10). The second will be assigned the user requested IP of 172.31.32.20.

formatVersion: 1
inputs: {}
resources:
  Cloud_Network_1:
    type: Cloud.Network
    properties:
      networkType: existing
      name: ipam
      constraints:
        - tag: 'ipam:infoblox_aws'
  Cloud_Machine_1:
    type: Cloud.Machine
    properties:
      image: Ubuntu 18.04 LTS
      flavor: generic.tiny
      remoteAccess:
        authentication: keyPairName
        keyPair: id_rsa
      Infoblox.IPAM.Network.dnsSuffix: corp.local
      # Infoblox.IPAM.createHostRecord: false
      # Infoblox.IPAM.createAddressRecord: false
      # Infoblox.IPAM.Network.enableDns: false
      # Infoblox.IPAM.Network.dnsView: somethingElse
      networks:
        - network: '${resource.Cloud_Network_1.id}'
          assignment: static
          # will assign first available if address is not set
          # address: 172.31.15.11
          assignPublicIpAddress: false
  Cloud_Machine_2:
    type: Cloud.Machine
    properties:
      image: Ubuntu 18.04 LTS
      flavor: generic.tiny
      remoteAccess:
        authentication: keyPairName
        keyPair: id_rsa
      Infoblox.IPAM.Network.dnsSuffix: corp.local
      # Infoblox.IPAM.createHostRecord: false
      # Infoblox.IPAM.createAddressRecord: false
      # Infoblox.IPAM.Network.enableDns: false
      # Infoblox.IPAM.Network.dnsView: somethingElse
      networks:
        - network: '${resource.Cloud_Network_1.id}'
          assignment: static
          # will assign first available if address is not set
          address: 172.31.32.20
          assignPublicIpAddress: false

Deploy the blueprint, then check to see if the Lambda functions ran. Click on Extensibility -> Action Runs, then change the run type to INTEGRATION RUNS. Then click on the first Infoblox_AllocateIP action. The assigned IP will be in the Outputs section near the end of the JSON.

{
  "ipAllocations": [
    {
      "domain": "corp.local",
      "ipRangeId": "range/ZG5zLmRoY3BfcmFuZ2UkMTcyLjMxLjMyLjEwLzE3Mi4zMS4zMi4yMC8vLzAv:172.31.32.10/172.31.32.20/default",
      "ipVersion": "IPv4",
      "properties": {
        "Infoblox.IPAM.RangeId": "range/ZG5zLmRoY3BfcmFuZ2UkMTcyLjMxLjMyLjEwLzE3Mi4zMS4zMi4yMC8vLzAv:172.31.32.10/172.31.32.20/default",
        "Infoblox.IPAM.Network.dnsView": "default"
      },
      "ipAddresses": [
        "172.31.32.20"
      ],
      "ipAllocationId": "/resources/network-interfaces/ebef4233-6e94-411d-9f9f-f26096acaa58"
    }
  ]
}

Looks good so far. Now let’s check InfoBlox. Login, then go to Data Management -> IPAM.

Then check to see that the hosts were added to corp.local. Click on Data Management -> DNS -> corp.local. You should see the two new entries.

Now destroy the deployment to make sure the IPAM and DNS entries are cleaned up.

The DNS entries were also removed.

So there you have it, vRAC, AWS and InfoBlox integration.

AWS IPAM with vRealize Automation Cloud and InfoBlox Part 1

The next two articles will discuss how to set up InfoBlox for AWS as an IPAM provider for vRealize Automation Cloud (vRAC). InfoBlox will be hosted in AWS using a community AMI. I’ll be using the latest version (1.0) of the VMware InfoBlox vRA 8.x plugin available on the VMware Solution Exchange, and InfoBlox version 8.5.0 (any version that supports WAPI v2.7 should work).

Two AWS accounts are needed, one for InfoBlox vDiscovery and the other for vRAC AWS Cloud Account.

First, the InfoBlox vDiscovery user: create a role following the directions on page 35 of the vNIOS for AWS document, then create a new user and download the credentials.

Secondly, assuming you already have your AWS Cloud Account set up, add the following roles and permissions to your AWS vRAC user.

  • IAMReadOnlyAccess / AWS Managed Policy – Needed when adding the InfoBlox Integration
  • AWSLambdaBasicExecutionRole / AWS Managed Policy – Used by the plugin to run Lambda functions
  • IAM:PassRole / Inline policy – Needed when adding the InfoBlox Integration

Here is a screen shot of my working AWS Policy and Permissions for the vrac user account.

Now on to deploying the InfoBlox for AWS AMI. This deployment requires two subnets in the same availability zone. Detailed installation directions start on page 22 of the vNIOS for AWS document. Make sure to select one of the DDI BYOL AMIs. I’m using ami-044c7a717e19bb001 for this blog. Here is a screen shot of the search for the community InfoBlox AMIs.

Some notes on the AMI deployment. First, make sure the additional (new) interface is on a different subnet; the management interface (eth1) will need internet access. Second, assign a Security Group which allows SSH from your local machine and HTTPS from anywhere.

Take a 10 or 15 minute break while the instance boots and the Status Checks complete. You can use this time to assign an EIP to the ENI attached to eth1. You can get the Interface ID by clicking on the instance’s eth1 interface under Instance Description and copying the Interface ID value (at the top of the popup).

Next assign a new or existing EIP to the Network Interface.

SSH to the instance as admin with the default password of infoblox. Once logged in you will need to add some temporary licenses (or permanent ones if you have them). Add the license options shown in this screen shot. When adding #4, select #2, IB-V825. This will force a reboot.

Give the appliance about 5 minutes before browsing to https://<EIP Address>. Login as admin with the default password of infoblox.

The first login will eventually send you to the Grid Setup Wizard. My environment was set up using these settings.

  1. Step 1, Configure as a Grid Master
  2. Step 2, Changed the Shared Secret
  3. Step 3, No changes
  4. Step 4, Changed the password to something more complex than ‘infoblox’
  5. Step 5, No changes
  6. Step 6, Click Finish

Next, enable the DNS Resolver in Grid Properties (click on Grid, click Grid Properties, then add the DNS server under DNS Resolver).

Add a new Authoritative forward-mapping zone under Data Management -> DNS. I’m using corp.local for this article.

Then start the DNS server under Grid -> Grid Manager: click DNS, select the grid master, and click the Start button.

Now on to discovering the VPCs, Subnets, and used IPs. Click on Data Management -> IPAM, then click on vDiscovery on the right-hand side. I used the following settings.

  1. Step 1, Job Name – AWS. Member infoblox.localdomain (assuming you left everything default when setting up the grid).
  2. Step 2, Server Type – AWS, Service Endpoint – ec2.<region>.amazonaws.com, Access Key ID – <vDiscovery User Access Key>, Secret Access Key – <vDiscovery User Secret Access Key>.
  3. Step 3, no changes
  4. Step 4, enable DNS host record creation. Set the computed DNS name to ${vm_name}
  5. Step 5, Click Save & Close

Here is a screen shot of my settings for Step 4 (above).

Now to run the vDiscovery. Click the drop down arrow on Discovery and select vDiscovery Manager. Select the AWS Job, then click start.

Hopefully the job will complete in a few seconds (assuming you have a small environment). My job ran fine and discovered the two VPCs I have in my Region.

Drilling down into the first Subnet in my default VPC lists the addresses currently in use or reserved. Here I set the filter to show Status equals Used.

This should do for now. The next article will walk through the integration with vRAC, including the deployment of an AWS machine with defined IP, and one with the first available IP in a Range.

Stay tuned.

vExpert 2020

It looks like my efforts in 2019 finally paid off, as I was awarded VMware vExpert 2020 in late February 2020.

I started working on content early last year, publishing close to 10 articles between knotacoder and my employer’s website.

Plus I signed up for a couple of VMware Design Programs (vRA and vROPS) which also helped give me more items to claim on my submissions.

Thanks to everyone who visited this site since last year. I’m currently working on additional vRAC content even as I type.

vRAC Security Groups revisited

This is a follow-up to the previous article. A co-worker of mine came up with a better and much cleaner solution.

My original solution worked, but introduced a nasty deployment topology diagram.  In effect it showed every SG as attached, even unused ones. This diagram is very misleading and doesn’t reflect the actual assignment of the Security Groups.

The new solution is much cleaner and more closely represents what the user actually requested. Here the two mandatory SGs, as well as the required role SG, are attached.

The new conceptual code seemed logical, but vRAC just didn’t like it.

formatVersion: 1
inputs:
  extraSG:
    type: string
    title: Select extra SG
    default: nsx:compute_web_sg
    oneOf:
      - title: web
        const: nsx:compute_web_sg
      - title: app
        const: nsx:compute_app_sg
      - title: db
        const: nsx:compute_db_sg
resources:
  ROLE_SG:
    type: Cloud.SecurityGroup
    properties:
      constraints:
        - tag: '${input.extraSG}'
      securityGroupType: existing
  Cloud_Machine_1:
    type: Cloud.Machine
    properties:
      image: RHEL 8 - Encrypted EBS
      flavor: generic.small
      networks:
        - network: '${resource.Cloud_Network_1.id}'
          assignPublicIpAddress: false
          securityGroups:
            - '${resource.ROLE_SG.id}'

After some tinkering I came up the following blueprint.

formatVersion: 1
inputs:
  nsxNetwork:
    type: string
    default: compute
    enum:
      - compute
      - transit
  extraSG:
    type: string
    title: Select extra SG
    default: web
    enum:
      - web
      - app
      - db
resources:
  ROLE_SG:
    type: Cloud.SecurityGroup
    properties:
      name: '${input.extraSG + ''_sg''}'
      constraints:
        - tag: '${''nsx:'' + input.nsxNetwork + ''_'' + input.extraSG + ''_sg''}'
      securityGroupType: existing
  vmOverlay:
    type: Cloud.SecurityGroup
    properties:
      name: NSX Overlay
      constraints:
        - tag: 'nsx:vm-overlay-sg'
      securityGroupType: existing
  WebDMZ:
    type: Cloud.SecurityGroup
    properties:
      name: WebDMZ
      constraints:
        - tag: 'nsx:compute_webdmz'
      securityGroupType: existing
  Cloud_Machine_1:
    type: Cloud.Machine
    properties:
      remoteAccess:
        authentication: keyPairName
        keyPair: id-rsa 
      image: RHEL 8 - Encrypted EBS
      flavor: generic.small
      constraints:
        - tag: 'cloud_type:public'
      tags:
        - key: nsxcloud
          value: trans_ssh
      networks:
        - network: '${resource.Cloud_Network_1.id}'
          assignPublicIpAddress: false
          securityGroups:
            - '${resource.WebDMZ.id}'
            # Adding for NSX Cloud
            - '${resource.vmOverlay.id}'
            # if input.extraSG == "web" then WEB_SG else if input.extraSG == "app" then APP_SG else DB_SG
            # - '${input.extraSG == "web" ? resource.WEB_SG.id : input.extraSG == "app" ? resource.APP_SG.id : resource.DB_SG.id}'
            - '${resource.ROLE_SG.id}'
  Cloud_Network_1:
    type: Cloud.Network
    properties:
      networkType: existing
      constraints:
        - tag: 'nsx:cloud_compute'

As you can see, there is more than one way to solve use cases with vRAC.  The key sometimes is just to keep trying different options to get the results you want.

vRAC Security Groups lessons learned

A recent vRealize Automation Cloud (vRAC) use case involved applying AWS Security Groups (SGs) when deploying a new machine. First, every new AWS machine will be assigned two standard SGs. A third will be assigned based on the application type (Web, App, DB).

After looking at the Cloud Assembly blueprint expression syntax page, it looked like we would be limited to only two options in our condition (if else). For example, ${input.count < 2 ? "small" : "large"}. Or if input.count < 2 then “small” else “large”.

But we have three options, not two. Effectively we needed:

if (extraSG == 'web') {
  web_sg
} else if (extraSG == 'app') {
  app_sg
} else {
  db_sg
}

Or, using JavaScript-style ternary shorthand.

extraSG == 'web' ? web_sg : extraSG == 'app' ? app_sg : db_sg

Or converted into something vRAC can consume.

${input.extraSG == "web" ? resource.WEB_SG.id : input.extraSG == "app" ? resource.APP_SG.id : resource.DB_SG.id}

Let’s see what happens when we deploy this blueprint, selecting WEB_SG from the list.

formatVersion: 1
inputs:
  extraSG:
    type: string
    title: Select extra SG
    default: web
    oneOf:
      - title: WEB_SG
        const: web
      - title: APP_SG
        const: app
      - title: DB_SG
        const: db
resources:
  WEB_SG:
    type: Cloud.SecurityGroup
    properties:
      name: WEB_SG
      constraints:
        - tag: 'nsx:compute_web_sg'
      securityGroupType: existing
  APP_SG:
    type: Cloud.SecurityGroup
    properties:
      name: APP_SG
      constraints:
        - tag: 'nsx:compute_app_db'
      securityGroupType: existing
  DB_SG:
    type: Cloud.SecurityGroup
    properties:
      name: DB_SG
      constraints:
        - tag: 'nsx:compute_db_sg'
      securityGroupType: existing
  vmOverlay:
    type: Cloud.SecurityGroup
    properties:
      name: NSX Overlay
      constraints:
        - tag: 'nsx:vm-overlay-sg'
      securityGroupType: existing
  WebDMZ:
    type: Cloud.SecurityGroup
    properties:
      name: WebDMZ
      constraints:
        - tag: 'nsx:compute_webdmz'
      securityGroupType: existing
  Cloud_Machine_1:
    type: Cloud.Machine
    properties:
      remoteAccess:
        authentication: keyPairName
        keyPair: id-rsa
      image: RHEL 8
      flavor: generic.small
      constraints:
        - tag: 'cloud_type:public'
      tags:
        - key: nsxcloud
          value: trans_ssh
      networks:
        - network: '${resource.Cloud_Network_1.id}'
          assignPublicIpAddress: false
          securityGroups:
            - '${resource.WebDMZ.id}'
            # Adding for NSX Cloud
            - '${resource.vmOverlay.id}'
            # if input.extraSG == "web" then WEB_SG else if input.extraSG == "app" then APP_SG else DB_SG
            - '${input.extraSG == "web" ? resource.WEB_SG.id : input.extraSG == "app" ? resource.APP_SG.id : resource.DB_SG.id}'
  Cloud_Network_1:
    type: Cloud.Network
    properties:
      networkType: existing
      constraints:
        - tag: 'nsx:cloud_compute'

Let’s take a look at the deployment topology from vRAC.

This diagram indicates that all of the SGs were attached to the machine. That can’t be right. I wonder what the machine looks like in AWS.

Hmm, looks like a bug to me.
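To confirm what AWS actually attached, you can query the EC2 API directly. A minimal sketch using the AWS CLI; the helper name is mine and the instance ID in the usage line is a placeholder:

```shell
# List the security-group names attached to an EC2 instance, to
# cross-check the vRAC topology view (helper name is my own)
list_instance_sgs() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --query 'Reservations[].Instances[].SecurityGroups[].GroupName' \
    --output text
}
# Usage: list_instance_sgs i-0123456789abcdef0
```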

Now on to leveraging NSX Cloud with vRAC. Stay tuned.

vRAC Ansible Control Host Lessons Learned

Now that I’ve spent a couple of months with vRealize Automation Cloud (vRAC) (AKA VMware Cloud Assembly) I figured it would be a good time to jot down some lessons learned from deploying and using several Ansible Control Hosts (ACH).

My first ACH was an Ubuntu machine deployed on our vSphere cluster. This deployment model requires connectivity to a Cloud Proxy. All of my ACHs since then have been AWS t2.micro Amazon Linux instances. The remainder of this post focuses on that environment.

First, assign an Elastic IP if you plan on shutting the instance down when not in use. Failing to do so makes your vRAC-integrated control host unreachable after a reboot, as the IP will change.
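For reference, a sketch of allocating and associating an Elastic IP with the AWS CLI (the instance ID is a placeholder, and this assumes a VPC-based instance):

```shell
# Allocate an Elastic IP and bind it to the ACH instance so the
# address survives stop/start cycles (helper name is my own)
attach_eip() {
  local alloc_id
  alloc_id=$(aws ec2 allocate-address --domain vpc \
    --query AllocationId --output text)
  aws ec2 associate-address --instance-id "$1" --allocation-id "$alloc_id"
}
# Usage: attach_eip i-0123456789abcdef0
```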

This leaves you with an unusable ACH, with no option but to replace it with a new one. A word of warning here: make sure to delete any deployments that used that ACH before replacing it. Failing to do so will leave you with an undeletable deployment. Plus, you may not even be able to delete the ACH from the Integrations page. Oh joy!

Next is the AWS security group. The offshore developers cannot provide a list of source IPs for vRAC calls into the ACH. In other words, you have to open your server up to the world. I’ve been told they are working on this, but I don’t know when they’ll have a fix.

So I’ve been experimenting with ConfigServer Security and Firewall (CSF). My current test is pretty much out of the box, using two block lists in csf.blocklists. So far so good: I was able to add the test ACH without an issue, and it is blocking tons of IPs. The host is using about 180K of RAM with the blocklists in place.
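For reference, entries in csf.blocklists use the format NAME|REFRESH|MAX|URL (a MAX of 0 means no entry limit). The two lists I enabled look something like the stock Spamhaus and DShield examples that ship commented out with CSF; your choices may differ:

```
# Name|Refresh rate (secs)|Max entries (0 = no limit)|URL
SPAMDROP|86400|0|https://www.spamhaus.org/drop/drop.txt
DSHIELD|86400|0|https://www.dshield.org/block.txt
```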

Changing the SSH server port to something other than port 22 doesn’t work right now. I brought this up on my weekly VMware call, so hopefully they’ll fix it in the near future.

I’m using an S3 bucket to back up my playbooks once a day. I did have to create an IAM Role with S3 permissions and assign it to my ACH. Eventually I’ll have the files pushed up to a repository.
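The backup itself can be as simple as a nightly cron job running aws s3 sync; a sketch, where the bucket name, playbook path, and script location are all assumptions:

```shell
# Push the local playbook directory to S3 (bucket name is hypothetical);
# the instance's IAM Role supplies the credentials
backup_playbooks() {
  aws s3 sync "$1" "s3://my-ach-backups/playbooks" --delete
}
# crontab entry (02:00 daily), assuming the helper lives in a script:
#   0 2 * * * /home/ec2-user/backup-playbooks.sh
```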

Now on to troubleshooting. One of my beefs is the naming convention of the log folders in the ACH user’s home directory (var/tmp/vmware/provider/user_defined_script/). The only way to figure out which directory to look in is to list them by date (ls -alt) or to run a find on the machine IP.
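A small helper can run that IP search for you; a sketch assuming the directory layout above (the function name is mine):

```shell
# Locate the newest deployment log folder mentioning a given machine IP
LOGROOT=var/tmp/vmware/provider/user_defined_script
find_deploy_dir() {
  grep -rl "$1" "$LOGROOT" 2>/dev/null \
    | xargs -r ls -t 2>/dev/null \
    | head -n 1 \
    | xargs -r dirname
}
# Usage: find_deploy_dir 10.0.0.5
```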

The ACH playbook log directory contains different files depending on the run state. A running deployment contains additional information, such as the actual ansible command that was invoked (a file named exec). The exec file disappears when the deployment completes or errors out, making it almost impossible to look at the ansible command and figure out why it failed. My workaround is to quickly jump into the directory and print out ‘exec’.
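That “jump in quickly and print it” step can be scripted too; a sketch using the same path (the helper name is mine):

```shell
# Print the ansible command file (exec) from the newest deployment
# folder before vRAC cleans it up
LOGROOT=var/tmp/vmware/provider/user_defined_script
show_exec() {
  local newest
  newest=$(ls -dt "$LOGROOT"/*/ 2>/dev/null | head -n 1)
  [ -n "$newest" ] && cat "${newest}exec"
}
```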

The ansible command isn’t the only thing deleted when a deployment fails; the host and user variables under /etc/ansible/host_vars/{ip address} are removed as well. I just keep two windows open and print out the contents of anything beginning with ‘vra*’. However, ‘log.txt’ in the deployment log folder does contain the extra host variables, but not the user variables.

I’m still figuring out how much space I need in the user’s home directory for log files. vRAC doesn’t delete folders when destroying a deployment, so I suspect an active ACH will eventually fill up the ACH user log directory (var/tmp/vmware/provider/user_defined_script/). Right now I’m seeing an average folder size of a bit less than 20k for completed deployments, or about 1G per 60 deployments. And no, this isn’t on my gripe list yet, but it will be.
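Until vRAC cleans up after itself, a scheduled prune can keep that log directory in check; a sketch, where the 30-day retention is my own assumption:

```shell
# Delete deployment log folders not modified in more than N days
LOGROOT=var/tmp/vmware/provider/user_defined_script
prune_old_logs() {
  find "$LOGROOT" -mindepth 1 -maxdepth 1 -type d -mtime +"$1" \
    -exec rm -rf {} +
}
# e.g. call prune_old_logs 30 from a nightly cron script
```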

That’s it for now. Come back soon.