I spy, with my little eye… Encryption at Host in Azure Gov Cloud?

One of the features that has been missing from Azure Gov cloud is encryption at host. With dm-crypt restricted to certain Linux distributions and BitLocker carrying CPU overhead, this feature is a big win, not to mention the federal compliance requirements you may be trying to meet. It feels like some well-kept secret, and I am not sure why. You still need to access the portal with a special link just to provision with it enabled in commercial cloud, there are no Bicep/ARM template examples, and a lot of the documentation seems to come from third-party blogs. Well, look no further!

I published a quick ARM template that enables encryption at host, but before we deploy, we need to make sure the feature is enabled on the subscription. Check whether it is registered by running Get-AzProviderFeature -FeatureName "EncryptionAtHost" -ProviderNamespace "Microsoft.Compute", and if it is not, register it with Register-AzProviderFeature -FeatureName "EncryptionAtHost" -ProviderNamespace "Microsoft.Compute"

Once the feature has been registered, you can create a VM using this link for gov cloud: https://portal.azure.us/?feature.enabledoubleencryption=true&feature.enablehostbasedencryption=true
When you get to the disks pane, there will be an option to enable encryption at host.

Screenshot of the virtual machine creation disks pane, encryption at host highlighted.

Using an ARM template is as easy as adding a securityProfile with encryptionAtHost set to true:

          "securityProfile": {
              "encryptionAtHost": true
          }
For a complete sample, please go here https://raw.githubusercontent.com/jrudley/vmencathost/main/azuredeploy.json
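Since Bicep examples are just as scarce, here is a minimal sketch of the same setting in Bicep. This is illustrative only: the resource name, API version, and location are placeholders, and the required hardware, storage, OS, and network profiles are omitted for brevity.

```bicep
resource vm 'Microsoft.Compute/virtualMachines@2021-11-01' = {
  name: 'myVm'                      // placeholder name
  location: resourceGroup().location
  properties: {
    securityProfile: {
      encryptionAtHost: true        // the setting that matters here
    }
    // hardwareProfile, storageProfile, osProfile, networkProfile omitted
  }
}
```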

I haven’t seen any announcements for encryption at host for gov cloud, but then again, I don’t see many for gov cloud to begin with. Hopefully, this makes your FedRAMP and CMMC journey a little easier πŸ™‚

Azure Run Command via API

I had a scenario where I needed an end user to be able to run a few ad hoc commands via an Azure Automation runbook and return the results. I am a big fan of Azure Automation, as it has a nice display of the jobs and categorizes exceptions, warnings, and output. The VM is running Ubuntu, but unfortunately, you cannot run ad hoc commands using the Invoke-AzVMRunCommand cmdlet. You need to pass in a script 😦 I tried an inline script and also exporting it out and referencing it in the runbook, but it would just display nothing. Knowing that az cli can run ad hoc commands, I figured I would research the API.

I was getting nowhere with the Microsoft docs, as the documented response was not the one I was getting. One simple trick: open the browser developer tools and monitor the API call the portal sends. In the picture below, you can see the API call and the JSON body, which contains a simple command calling date. You can copy the API call directly from the dev tools in the specific format you want.

Now that I can make the call, I noticed it is sent asynchronously. Looking at the next call in my dev tools, I saw this URI being called with some GUIDs.

I tried to research this call at https://docs.microsoft.com/en-us/rest/api/compute/operations/list#computeoperationlistresult but I didn’t see an explanation for the GUIDs in the URI. What I did figure out is that the response from Invoke-WebRequest has header keys called Location and Azure-AsyncOperation, both of which contain a URI matching the call the portal was making. We can use a simple while loop to wait until Invoke-WebRequest returns content, which holds our stdout from the run command. It will look something like this in an Azure runbook:

# Log in with the Automation account's managed identity
Connect-AzAccount -Identity

# Acquire an ARM access token from the current context
$azContext = Get-AzContext
$azProfile = [Microsoft.Azure.Commands.Common.Authentication.Abstractions.AzureRmProfileProvider]::Instance.Profile
$profileClient = New-Object -TypeName Microsoft.Azure.Commands.ResourceManager.Common.RMProfileClient -ArgumentList ($azProfile)
$token = $profileClient.AcquireAccessToken($azContext.Subscription.TenantId)

$auth = @{
    'Content-Type'  = 'application/json'
    'Authorization' = 'Bearer ' + $token.AccessToken
}

# POST the run command; the call returns before the command finishes
$response = Invoke-WebRequest -UseBasicParsing -Uri "https://management.azure.com/subscriptions/$($azContext.Subscription.Id)/resourceGroups/ubuntu/providers/Microsoft.Compute/virtualMachines/ubuntu/runCommand?api-version=2018-04-01" `
    -Method "POST" `
    -Headers $auth `
    -ContentType "application/json" `
    -Body "{`"commandId`":`"RunShellScript`",`"script`":[`"date`"]}"

# Grab the polling URI from the Location header
foreach ($key in ($response.Headers.GetEnumerator() | Where-Object { $_.Key -eq "Location" })) {
    $checkStatus = $key.Value
}

# Poll until the async operation returns content
$contentCheck = Invoke-WebRequest -UseBasicParsing -Uri $checkStatus -Headers $auth
while (($contentCheck.Content).Count -eq 0) {
    Write-Output "Waiting for async call to finish..."
    Start-Sleep -Seconds 15
    $contentCheck = Invoke-WebRequest -UseBasicParsing -Uri $checkStatus -Headers $auth
}

# The stdout from the run command
($contentCheck.Content | ConvertFrom-Json).value.message

As you can see, the runbook logs in with a managed identity, calls the run command with a POST, enters a while loop to wait for it to finish, then outputs the results.
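If you prefer to drive the same REST call from another language, the request construction and response parsing can be sketched in Python. The subscription ID, resource group, and VM names below are placeholders, and the sample response is a trimmed illustration of the shape the polling URI eventually returns, not a captured payload:

```python
import json

API_VERSION = "2018-04-01"

def run_command_request(subscription_id, resource_group, vm_name, commands):
    """Build the runCommand URI and JSON body for a Linux VM."""
    uri = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}/providers/Microsoft.Compute"
        f"/virtualMachines/{vm_name}/runCommand?api-version={API_VERSION}"
    )
    body = json.dumps({"commandId": "RunShellScript", "script": commands})
    return uri, body

def extract_stdout(poll_response_json):
    """Pull the message text out of the async operation's final response."""
    doc = json.loads(poll_response_json)
    return [entry["message"] for entry in doc.get("value", [])]

if __name__ == "__main__":
    uri, body = run_command_request(
        "00000000-0000-0000-0000-000000000000", "ubuntu", "ubuntu", ["date"]
    )
    print(uri)
    print(body)
    # Trimmed example of what the polling URI returns once the command finishes
    sample = '{"value": [{"code": "ProvisioningState/succeeded", "message": "Enable succeeded"}]}'
    print(extract_stdout(sample))
```

You would still POST the body with a bearer token and poll the Location header exactly as the runbook above does.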

Azure Kubernetes Service and Network Security Groups

One of the most common mistakes I see is people modifying the NSG rules for AKS manually instead of letting AKS manage them. AKS is a managed service, so it will manage the rules. If the NSG rules are modified manually, AKS might reset them, which could leave your service in a broken state or exposed to threats.

If you look at the annotations for type LoadBalancer at https://docs.microsoft.com/en-us/azure/aks/load-balancer-standard#additional-customizations-via-kubernetes-annotations , you can see an annotation named service.beta.kubernetes.io/azure-allowed-service-tags. Typically, we would have some kind of WAF sitting in front, such as Azure Front Door. We can set the service tag AzureFrontDoor.Backend, which lets AKS manage the inbound rule so that only Azure Front Door’s IPs can communicate with this public IP.

Let’s do a quick example by deploying this YAML, which has the service type set to LoadBalancer and will provision a public IP for us.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld-one
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-helloworld-one
  template:
    metadata:
      labels:
        app: aks-helloworld-one
    spec:
      containers:
      - name: aks-helloworld-one
        image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
        ports:
        - containerPort: 80
        env:
        - name: TITLE
          value: "Welcome to Azure Kubernetes Service (AKS)"
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld-one
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: aks-helloworld-one

Let’s do a kubectl apply and view the svc.

You can see a public IP has been associated with the svc. Let’s take a look at the inbound NSG. The public IP is open to the internet, but I want this svc to be protected by my WAF on Azure Front Door.

To apply a tag correctly, let’s modify the YAML to set the correct annotation. In the picture below, I am setting the tag AzureFrontDoor.Backend, which AKS will ensure is always present and managed automatically.
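In text form, the Service portion would look something like this; a sketch based on the hello-world example above, where the annotation line is the only part that changes:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld-one
  annotations:
    # AKS keeps the NSG inbound rule scoped to this service tag
    service.beta.kubernetes.io/azure-allowed-service-tags: "AzureFrontDoor.Backend"
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: aks-helloworld-one
```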

Save the YAML and apply it to update the service.

Viewing the inbound NSG for AKS, we can see it automatically updated the service tag.

Remember, AKS is a managed service. Let it manage the NSGs for you!

Azure Bastion Alternatives

I had a project come up where I needed two-factor auth and RDP access with no public IP. I instantly thought Azure Bastion would be great for this: I can use conditional access and hit my private-IP VMs. Well, the VM had to be Ubuntu running the GNOME desktop with xRDP, and Azure Bastion is tied to the OS profile, where it uses SSH for Linux or RDP for Windows. There is an open feedback item to allow RDP to Linux. With all of that being said, let me present… Apache Guacamole. Nothing like presenting to executives saying let’s use Guacamole to solve our issue, haha.

I found an Azure Marketplace image from Bitnami that provisions a VM with HTTP-to-HTTPS redirection enabled, some dummy certificates, and Guacamole installed.

Once you provision the image, it has a public IP already assigned, with an NSG on the NIC opening ports 80, 443, and 22. I’d modify that NSG to remove port 80 and lock down port 22 to your IP, or remove it entirely and just use the serial console. Now, going back to my original requirement of 2FA, there is a SAML extension you can use, and we can easily create a matching SAML application in Azure Active Directory. Before we do this, we want to make sure we add a new user account with admin permissions in the format user@aadDomain; otherwise, once SAML is configured, we won’t be able to log in to the UI unless we use the API with the default guacadmin account. You can certainly use the API to create new SAML accounts in Guacamole, but log in first with the guacadmin creds to make testing easier. To get the default guacadmin password, look here. Make sure you change it!

Log in and add a new user with admin permissions. For the username, put in the FQDN of the user in AAD. Do not set a password.

Once we log in with the AAD creds, we can delete the guacadmin account.

Get on the Guacamole VM, download the SAML extension, tar -xf it, and copy the jar into /opt/bitnami/guacamole/extensions. When Guacamole is restarted, it will automatically load the jar. We don’t want to restart just yet, as we still need to configure the guacamole.properties file with the SAML entries. Let’s create a new Azure Enterprise Application and select Create your own application.

Give your app a new name and hit Create.

You will be taken to your new application, where you will now select Single Sign On.

Select SAML

Edit the basic configuration.

First, modify the Entity ID and Reply URL. We want to put in the FQDN end users will use to access it in their browser; I have a domain mapped to the public IP, blogrds.azuretestingsite.com. Hit save, and then we need to grab the Login URL from #4.

Back on the VM, edit the /opt/bitnami/guacamole/guacamole.properties file and add these 3 lines, where saml-idp-url is the Login URL from our enterprise SAML app, and saml-entity-id and saml-callback-url are our FQDN mapped to the public IP:

saml-idp-url: <Login URL from the enterprise application>
saml-entity-id: https://blogrds.azuretestingsite.com
saml-callback-url: https://blogrds.azuretestingsite.com

Save the file. The last step is that we need a valid certificate for our domain. I already have one and replaced server.crt and server.key in /opt/bitnami/apache/conf/bitnami/certs. There is also a tool from Bitnami that handles Let’s Encrypt for you.

Restart the required services with sudo /opt/bitnami/ctlscript.sh restart

Now, either add the AAD user to the enterprise application or toggle User assignment required to No.

Have your user navigate to the FQDN and they will be redirected to auth against AAD.

A couple of things to note. I took this project one step further with an ARM template that pulls the secrets, including your certificate, from a key vault. If you have a WAF in front, such as Azure Front Door, assign a custom domain name with TLS and set up your AAD application to use that FQDN. I have a custom script extension that preps the VM with the steps we did above. For my project, I just pushed the ARM template to Template Specs for quick and easy provisioning.

Azure DevTest Labs “Install Windows Updates” not working

I was building out a formula in DevTest Labs the other day and added a few artifacts, including “Install Windows Updates”. My goal was to build an ARM template that deploys DTL, creates a VM, then a custom image off that VM. Everything worked great, but being a good IT pro, I double-checked my work. Looking at the VM, you can drill down into the artifacts section and see the status of each artifact applied. I had green everywhere, so it looked good. Upon further inspection, though, I noticed the Windows Update task finished extremely quickly.

Clicking on the task to get more details, I saw this:

It just displayed the updates and rebooted. Well, maybe it did install? I checked the update history on the VM and it was empty, even though those updates had been listed as ready to install. Curious, I went to the PowerShell file in the packages folder C:\Packages\Plugins\Microsoft.Compute.CustomScriptExtension\1.10.12\Downloads\4\PublicRepo\master\0b7a713c381a8cbecf04f92122d1c0b07324871e\Artifacts\windows-install-windows-updates\scripts\artifact.ps1 and saw the cmdlet:

I was able to run this script and reproduce the same output as above. I did some digging, and it looks like this cmdlet now needs additional parameters that it didn’t in the past.

I updated it and re-ran the script which displayed:

This is what I would expect to see. The bigger problem is that this lives in Microsoft’s artifacts repo: https://github.com/Azure/azure-devtestlab/tree/master/Artifacts/windows-install-windows-updates If you are using the public repo for your artifacts and consume this specific artifact, I’d double-check that you are actually patching your operating system. I submitted a pull request with the fix, so hopefully they review it soon.

Edit: Microsoft approved my pull request to fix this. Shouldn’t be an issue now πŸ™‚

Azure Gov B2C

While most people consume Azure commercial cloud, Azure Gov is another beast. The lack of documentation seems to make each project a bit more challenging.

I am currently doing a POC using B2C in gov cloud. B2C supports local accounts, which makes it great for putting application end-user accounts in their own tenant instead of building my own identity provider. Typically, when you create the B2C tenant, you link it to a subscription. I did not have this option, as I could only create the tenant itself. It typically looks like this when all is working:

Create a new Azure AD B2C tenant selected in Azure portal

Mine looked like this:

I looked for the feature provider Microsoft.AzureActiveDirectory and it is missing altogether. I opened a ticket with Microsoft, and they said B2C is supported and you don’t need to link it to a subscription. I was a bit confused, because a subscription is a billing boundary: if I used MFA or conditional access, how could it be billed? Well, you can’t do this. After pleading my CSS case, I was told that this is in preview and engineering knows about it. What stings a bit more is that the Azure feedback item has been open since 2017 😦

Either way, I kept moving forward to see what I could do with this POC. The first thing to call out is that the endpoints are not documented on the docs.microsoft.com site; you must use the Endpoints button in the B2C app registration you created. What I noticed is that instead of b2conline.com and your typical tenant TLD, you need to add .us.

Once I had my endpoints configured correctly, my user flows worked just fine and accounts were being created in the tenant. Next, I tried to create local accounts in the portal in the B2C tenant. Nope: it said it could not create the user, and I was confused why my app could do it just fine. I found the API doc for creating a user and tested against my commercial sub, where it worked, but when I tested against the gov Graph endpoint, it failed, saying the property creationType was missing. Alright, so I added creationType=LocalAccount into the JSON body and the API call worked. I guess this is a portal.azure.us issue.
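For reference, the request body that worked looked roughly like this. Apart from the creationType property called out above, this is a hedged sketch of a typical B2C local-account create-user payload, with placeholder names and values:

```json
{
  "accountEnabled": true,
  "creationType": "LocalAccount",
  "displayName": "Test User",
  "passwordProfile": {
    "password": "<initial password>",
    "forceChangePasswordNextLogin": false
  },
  "signInNames": [
    {
      "type": "emailAddress",
      "value": "testuser@example.com"
    }
  ]
}
```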

The last issue I found is that the tenant type is set to Preview tenant. I couldn’t find anything about what this meant until I stumbled across the 2016 announcement post: information about your tenant type is available in your B2C admin UI. If it says “Production-scale tenant”, you are good to go. If you have an existing “Preview tenant”, you must use it ONLY for development and testing. The lack of documentation about what is and isn’t preview hurts. This is a red flag, as I can’t deploy production apps into a preview tenant.

I popped a couple of tickets to Microsoft and will update this post once I get more information. More to come!

Edit: no date yet when it will GA in gov 😦