Fluent VMSS API losing protected settings when adding nodes
I’m running into issues using the fluent compute API to scale out a VMSS with protected settings in its template.
A complete repro is available here, but I walk through the relevant details below.
I have a virtual machine scale set that includes a custom script extension, so that a script is executed on every new VM instance as it’s provisioned. The custom script is pulled from a private Azure storage blob, so some of the connection information is in ‘protected settings’. Here is the custom script extension from the scale set’s Resource Manager template:
"extensionProfile": {
  "extensions": [
    {
      "name": "CustomInstallScript",
      "properties": {
        "type": "CustomScriptExtension",
        "publisher": "Microsoft.Compute",
        "typeHandlerVersion": "1.7",
        "autoUpgradeMinorVersion": true,
        "settings": {
          "fileUris": [
            "https://scriptstash.blob.core.windows.net/scripts/Script.ps1"
          ]
        },
        "protectedSettings": {
          "commandToExecute": "[concat('powershell -ExecutionPolicy bypass -File Script.ps1 -user ScaleSetUser -password ',parameters('virtualMachineScaleSets_scalesetsample_adminPassword'))]",
          "storageAccountName": "scriptstash",
          "storageAccountKey": "[Redacted]"
        }
      }
    }
  ]
}
For this repro, the script that’s executed is just a dummy script that creates a file on disk so that I know it ran.
I then use the following code to increase the capacity of the scale set:
// Login to Azure
var credentials = AzureCredentials.FromServicePrincipal(AzureClientId, AzureClientKey, AzureTenantId, AzureEnvironment.AzureGlobalCloud);
var AzureClient = Azure.Authenticate(credentials).WithSubscription(AzureSubscriptionId);
// Retrieve scale set
var scaleSet = AzureClient?.VirtualMachineScaleSets.GetById(ScaleSetId);
var oldCapacity = scaleSet.Capacity;
// Increment capacity
scaleSet.Update().WithCapacity(oldCapacity + 1).Apply();
This code successfully adds an instance to the scale set, but the new instance fails during provisioning with the following error (and when I check the VM, the file that the script should have created is missing):
Provisioning failed
Invalid Configuration - CommandToExecute is not specified in the configuration; it must be specified in either the protected or public configuration section
The issue, apparently, is that protected settings from the custom script extension are ignored. If I make my storage blob public and move the CommandToExecute setting into ‘settings’ instead of ‘protected settings’, everything works as expected.
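For illustration, the variant that works would look roughly like the fragment below. This is a sketch reconstructed from the description above, not the exact template used in the repro. It assumes the blob at the fileUris URL is publicly readable, so storageAccountName and storageAccountKey are dropped. It also exposes the command line, including the admin password parameter, in public settings, which is why this is not an acceptable long-term fix.

```json
"extensionProfile": {
  "extensions": [
    {
      "name": "CustomInstallScript",
      "properties": {
        "type": "CustomScriptExtension",
        "publisher": "Microsoft.Compute",
        "typeHandlerVersion": "1.7",
        "autoUpgradeMinorVersion": true,
        "settings": {
          "fileUris": [
            "https://scriptstash.blob.core.windows.net/scripts/Script.ps1"
          ],
          "commandToExecute": "[concat('powershell -ExecutionPolicy bypass -File Script.ps1 -user ScaleSetUser -password ',parameters('virtualMachineScaleSets_scalesetsample_adminPassword'))]"
        }
      }
    }
  ]
}
```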
I’ve found that I can work around this issue by using Azure PowerShell cmdlets to increase the VMSS capacity, since that approach doesn’t hit this error (the custom script runs successfully, even when it uses protected settings). However, I’d rather use the fluent API, since my workaround pulls in an otherwise unnecessary PowerShell dependency. For reference, the workaround looks like this:
using (var psInstance = PowerShell.Create())
{
    psInstance.AddScript($@"
        $clientId = ""{AzureClientId}""
        $clientKey = ConvertTo-SecureString -String ""{AzureClientKey}"" -AsPlainText -Force
        $Credential = New-Object -TypeName ""System.Management.Automation.PSCredential"" -ArgumentList $clientId, $clientKey
        Login-AzureRmAccount -Credential $Credential -ServicePrincipal -TenantId {AzureTenantId}
        $vmss = Get-AzureRmVmss -ResourceGroupName {ResourceGroup} -VMScaleSetName {ScaleSetName}
        $vmss.sku.capacity = {newCapacity}
        Update-AzureRmVmss -ResourceGroupName {ResourceGroup} -Name {NodeTypeToScale} -VirtualMachineScaleSet $vmss
    ");
    psInstance.Invoke();
}
Also, it’s worth noting that this workaround only works if the fluent API hasn’t yet been used to update the scale set (updating with it seems to drop protected settings from the template entirely, so they’re missing in the future, too, not just while the fluent API is working with the scale set).
Top GitHub Comments
This code fix was released in the recent v1.0.0 release (not a beta).
If protected settings are not being applied to VMs spun up as a result of fluent API use, that might explain why those VMs are not being added to the Service Fabric (SF) cluster. An extension that I think depends on some protected settings is responsible for joining new VMs to the cluster.