Welcome to Data#3’s five-part series focusing on the detail outlined in our recent blog, which covers our top five tips for the long-term success of Microsoft Azure-hosted infrastructure in your organisation. Data#3 has delivered over 100 Azure Health Checks for customers in the past two years and has consistently found the same problems.
This episode is all about tagging and how you can benefit in many ways by appropriately tagging your resources. It covers both the technical and organisational benefits of tagging, and is a must-read for all audiences.
Tagging in the context of Azure is the application of metadata to each resource to identify essential information that will assist with long-term operational and management excellence. Nearly every resource type in Azure supports tagging, and Microsoft provides a detailed list of all the resource types that support tags.
I recommend using a basic schema for tags and then extending the schema depending on service usage and advanced functionality/reporting requirements. If you over-complicate your schema from the outset, then you may encounter operations fatigue while deploying and maintaining tags.
Different roles within the organisation greatly benefit from tagging from different perspectives. Below are some examples:
Correct. However, the measure of the effort to apply tags during deployment is dramatically less than the effort required to manage a resource that is not tagged. If we are using programmatic deployment methods, then tags can be applied during creation. We are only adding seconds to a deployment and minutes to the thought process for pre-deployment. Compare that to hours of investigation and manual filtering of consumption information to break down cost or attribute a resource to an owner, for example.
Most resource types support tagging on creation within the Azure portal now as per the create Virtual Machine process below.
Azure, create a virtual machine
First, ask what future questions you would like to negate from an audit, cost management, accountability, ownership and incident management perspective. Below is the suggested minimum standard, which should be used as a baseline. Applying these tags greatly enhances your cost management and overall accountability, as well as your change management capability.
After you define the questions you want to negate, then design the schema choosing a global standard of Keys and the type of data that should be used for Values. An example based on the above would be:
Key | Value |
---|---|
Creator | Individual Name |
Owner | Individual Name or Business Unit\Department |
Charge to | Individual Name or Business Unit\Department |
Application Name | Business system or application name |
Status | POC\DEV\TST\UAT\PRD |
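A schema like this can also be checked mechanically. The following is a minimal sketch in Python, assuming a resource's tags are already available as a plain dictionary. The names `REQUIRED_KEYS`, `ALLOWED_STATUS` and `validate_tags` are hypothetical and simply mirror the example table; nothing here calls Azure.

```python
# Hypothetical baseline schema mirroring the example table; adjust to your own keys.
REQUIRED_KEYS = ["Creator", "Owner", "Charge to", "Application Name", "Status"]
ALLOWED_STATUS = {"POC", "DEV", "TST", "UAT", "PRD"}


def validate_tags(tags: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the tags match the schema."""
    problems = []
    # Every key in the baseline must be present with a non-blank value.
    for key in REQUIRED_KEYS:
        if not tags.get(key, "").strip():
            problems.append(f"missing or empty tag: {key}")
    # Status is the only key with a fixed set of permitted values.
    status = tags.get("Status", "").strip()
    if status and status not in ALLOWED_STATUS:
        problems.append(f"unexpected Status value: {status}")
    return problems


if __name__ == "__main__":
    example = {"Creator": "Jane Doe", "Owner": "Finance", "Status": "Prod"}
    for problem in validate_tags(example):
        print(problem)
```

Running a check like this before deployment is one way to catch a value such as `Prod` where the schema expects `PRD`.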
The next step is critical. Communicate these requirements to your teams and then implement at least one resource tag application. This will help with standardisation and long-term excellence. This matters because tag keys and values are free text: key names are matched case-insensitively but are displayed exactly as they were typed, while values are case sensitive.
One thing you want to avoid is inconsistent naming of the Key; an example of this is below. The issue here is precision and consistency. Although different capitalisation of the Key does not change the functionality of the tag, the display makes it more challenging for the human brain to review the information.
Azure, naming of the key
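Inconsistencies like these can also be surfaced programmatically. The following is a minimal sketch in Python, assuming the tags of each resource have already been exported as dictionaries; the function name `find_key_variants` is hypothetical.

```python
from collections import defaultdict


def find_key_variants(tag_sets):
    """Group tag keys that differ only by capitalisation or surrounding spaces."""
    variants = defaultdict(set)
    for tags in tag_sets:
        for key in tags:
            # Normalise each key so that 'Owner', 'owner' and 'OWNER' collide.
            variants[key.strip().lower()].add(key)
    # Report only the keys that were spelled in more than one way.
    return {
        normalised: sorted(spellings)
        for normalised, spellings in variants.items()
        if len(spellings) > 1
    }


if __name__ == "__main__":
    exported = [{"Owner": "A"}, {"owner": "B"}, {"OWNER": "C", "Status": "PRD"}]
    print(find_key_variants(exported))
```

Any key that appears in the result is a candidate for clean-up before the inconsistency spreads further.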
So how do we prevent different individuals from deviating from the ultimate intent to enforce consistency?
One easy fix is to prepopulate a placeholder resource with the desired naming convention for the tag keys. Unfortunately, you cannot (currently) define a tag schema without using Azure Policy; more on that later.
First, we create a resource group.
Azure, create a resource group
Name the resource group appropriately. This is to prevent resources being deployed by mistake, as this is a temporary resource that can be removed after other resources have had tags applied. Please disregard the error message below, as this is best practice on roles and administrative rights that will be covered in a future episode.
Azure, naming of the resource group
Now add your keys with the correct capitalisation and punctuation, along with a placeholder value.
Azure, correct punctuation of your keys
After this resource group has been created, the portal will be able to present the tag keys as drop-down options during resource deployment or when applying tags to existing resources. Prepopulated keys will prevent free-text mistakes in the long term.
Azure, pre-populated keys
This statement is slightly controversial, and I have been challenged on this quite often. Common responses are:
While in concept these two statements may be valid, there are flaws with this logic.
Tagging at the resource group will allow you to aggregate resource cost for chargeback or show your financial reporting. However, the level of detail will be severely limited. What if multiple resource types within the Resource Group belong to different teams/individuals?
To explain this in detail, I had a customer reach out with a concern over their Azure bill. Their consumption had jumped significantly in a 2-week period, and they were disputing a 43 thousand dollar increase in Azure cost. I was asked to investigate, and I was intrigued by what I saw under Azure Cost Management. The infrastructure deployed was mostly Windows Servers with no platform services. With this type of deployment, you would normally expect to see the following breakdown:
Server Compute = 60%
Server storage = 20%
Networking = 10%
Ancillary (Backup/Monitoring/Security Centre/etc.) = 10%
What I saw was Server Storage accounting for 70% of the total cost. No resources were tagged, so it was impossible to start filtering on different storage workloads. After a thorough investigation using Azure Monitor and deep inspection of Virtual Machine configurations, I found that a pool of Citrix servers had a dedicated Standard HDD managed disk allocated to Windows Page File usage. These servers were also memory constrained, and they were under heavy load due to a new shared application that had been deployed approximately 2 weeks prior. With Standard HDD disks you are billed for I/O transactions, and combined with excessive page file activity, this was contributing to a storage cost blowout.
If the disks were tagged as Application: Citrix, it would have been an instant drill down to that tag; therefore eliminating hours of investigation. I am not going to discuss the obvious here in that Azure provides a local SSD dedicated to Windows Page and Linux Swap.
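To illustrate what that drill-down amounts to, here is a minimal sketch in Python that groups cost records by a tag. The records, resource names and figures are invented for the example and are not the customer's data; the function name `cost_by_tag` is hypothetical.

```python
from collections import defaultdict

# Invented sample cost records, shaped like a simplified cost export.
COST_RECORDS = [
    {"resource": "disk-ctx-01", "category": "Storage", "cost": 400.0, "tags": {"Application": "Citrix"}},
    {"resource": "disk-ctx-02", "category": "Storage", "cost": 300.0, "tags": {"Application": "Citrix"}},
    {"resource": "disk-sql-01", "category": "Storage", "cost": 100.0, "tags": {"Application": "Finance DB"}},
    {"resource": "vm-ctx-01", "category": "Compute", "cost": 200.0, "tags": {"Application": "Citrix"}},
    {"resource": "disk-old-01", "category": "Storage", "cost": 50.0, "tags": {}},
]


def cost_by_tag(records, tag_key, category=None):
    """Total cost per tag value, optionally restricted to one cost category."""
    totals = defaultdict(float)
    for record in records:
        if category is not None and record["category"] != category:
            continue
        # Untagged resources are grouped together so they stay visible.
        label = record["tags"].get(tag_key, "(untagged)")
        totals[label] += record["cost"]
    # Largest spend first, so the outlier is the first entry.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    for label, total in cost_by_tag(COST_RECORDS, "Application", "Storage").items():
        print(f"{label}: {total:.2f}")
```

With tags in place, the application responsible for the storage spend is the first line of the output rather than the result of hours of investigation.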
For the second statement: well, dedicated subscriptions for workloads are a management nightmare. For every subscription you deploy, you need to duplicate per-subscription services, such as Azure Monitor, Security Center, Azure Backup and admin privileges. If you need inter-subscription connectivity, you will incur extra cost for ExpressRoute circuit authorisation or VNet peering and data transit.
Azure Policy is what you can use to report on and remediate missing or non-compliant tags.
Policy is a crucial starting principle for new Azure customers and is essential for long-term success from a governance and compliance perspective. Out of the 100+ Azure Health Checks Data#3 has performed, only seven customers were using policies, and none of them were using them effectively.
I am not going to go into great detail on policies, as that is an entire episode in itself. I do advise looking into the 13 policy definitions related to tags. To access policies, navigate to the subscription and then Policies.
Azure, policy compliance
Then select Assign Policy.
Azure, assign a policy
Select Policy Definition and add tag to the search filter.
Azure, policy definitions
For ease of reading, please refer to the below table of definitions exported from PowerShell.
Azure, table of policy definitions
Policy Display Name | Policy Description |
---|---|
Add a tag to resource groups | Adds the specified tag and value when any resource group missing this tag is created or updated. Existing resource groups can be remediated by triggering a remediation task. If the tag exists with a different value, it will not be changed. |
Add a tag to resources | Adds the specified tag and value when any resource missing this tag is created or updated. Existing resources can be remediated by triggering a remediation task. If the tag exists with a different value, it will not be changed. Does not modify tags on resource groups. |
Add or replace a tag on resource groups | Adds or replaces the specified tag and value when any resource group is created or updated. Existing resource groups can be remediated by triggering a remediation task. |
Add or replace a tag on resources | Adds or replaces the specified tag and value when any resource is created or updated. Existing resources can be remediated by triggering a remediation task. Does not modify tags on resource groups. |
Append tag and its default value | Appends the specified tag and value when any resource which is missing this tag is created or updated. Does not modify the tags of resources created before this policy was applied until those resources are changed. Does not apply to resource groups. New 'modify' effect policies are available that support remediation of tags on existing resources (see https://aka.ms/modifydoc). |
Append tag and its default value to resource groups | Appends the specified tag and value when any resource group which is missing this tag is created or updated. Does not modify the tags of resource groups created before this policy was applied until those resource groups are changed. New 'modify' effect policies are available that support remediation of tags on existing resources (see https://aka.ms/modifydoc). |
Append tag and its value from the resource group | Appends the specified tag with its value from the resource group when any resource which is missing this tag is created or updated. Does not modify the tags of resources created before this policy was applied until those resources are changed. New 'modify' effect policies are available that support remediation of tags on existing resources (see https://aka.ms/modifydoc). |
Inherit a tag from the resource group | Adds or replaces the specified tag and value from the parent resource group when any resource is created or updated. Existing resources can be remediated by triggering a remediation task. |
Inherit a tag from the resource group if missing | Adds the specified tag with its value from the parent resource group when any resource missing this tag is created or updated. Existing resources can be remediated by triggering a remediation task. If the tag exists with a different value, it will not be changed. |
Require specified tag | Enforces existence of a tag. Does not apply to resource groups. |
Require specified tag on resource groups | Enforces existence of a tag on resource groups. |
Require tag and its value | Enforces a required tag and its value. Does not apply to resource groups. |
Require tag and its value on resource groups | Enforces a required tag and its value on resource groups. |
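The differences between these definitions are easier to see side by side. The following Python sketch simulates the behaviours described in the table on plain dictionaries. It is not Azure Policy code, the function names are hypothetical, and for simplicity it compares tag names exactly rather than case-insensitively.

```python
def add_if_missing(tags, name, value):
    """'Add a tag' behaviour: an existing value is left unchanged."""
    updated = dict(tags)
    updated.setdefault(name, value)
    return updated


def add_or_replace(tags, name, value):
    """'Add or replace a tag' behaviour: the value is always overwritten."""
    updated = dict(tags)
    updated[name] = value
    return updated


def inherit_if_missing(resource_tags, group_tags, name):
    """'Inherit a tag from the resource group if missing' behaviour."""
    if name in resource_tags or name not in group_tags:
        return dict(resource_tags)
    return add_if_missing(resource_tags, name, group_tags[name])


def require_tag(tags, name, value=None):
    """'Require' behaviour: True means the create or update would be allowed."""
    if name not in tags:
        return False
    return value is None or tags[name] == value


if __name__ == "__main__":
    existing = {"Owner": "Finance"}
    print(add_if_missing(existing, "Owner", "IT"))   # value stays Finance
    print(add_or_replace(existing, "Owner", "IT"))   # value becomes IT
    print(require_tag(existing, "Status"))           # the tag is absent
```

The "add" and "inherit if missing" variants are the safer choice when existing values must be preserved, while the "require" variants block non-compliant deployments outright.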
Once you have defined your tag schema and remediated missing tags, you can then do the cool things. Another overlooked capability within the Azure portal is the total tag display for your environments. Select All services, enter tag into the search filter and hit the star icon to pin this to the left side menu.
Azure, pin the tags view
Selecting this option from the menu shows you all tagged resources.
Azure, clean up and new owner of resources
Did a recent employee depart the organisation and you need to enumerate all Azure resources deployed by that individual? Then selecting the Owner or CreatedBy:%Persons Name% tag from this view takes you to every resource that they deployed or own. From here, you can cross-reference with other metadata to determine if these resources should be cleaned up or transferred to a new owner.
Azure, display all tagged resources
Tagging also enables programmatic management of Azure resources. Below is a very basic multiple-subscription searcher that can filter results on tag values.
programmatic management of Azure resources
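The idea behind such a searcher can be sketched in a few lines of Python. This is an illustrative stand-in that works on an in-memory inventory keyed by subscription name, not the script pictured; the function name `find_by_tag` and the sample data are hypothetical, and a real version would first need to list resources from each subscription.

```python
def find_by_tag(inventory, tag_key, tag_value=None):
    """Return (subscription, resource name) pairs whose tags match.

    inventory maps a subscription name to a list of resource dictionaries.
    If tag_value is None, any resource carrying the key is returned.
    """
    wanted_key = tag_key.lower()
    matches = []
    for subscription, resources in inventory.items():
        for resource in resources:
            # Compare tag names case-insensitively and values exactly.
            tags = {k.lower(): v for k, v in resource.get("tags", {}).items()}
            if wanted_key not in tags:
                continue
            if tag_value is None or tags[wanted_key] == tag_value:
                matches.append((subscription, resource["name"]))
    return matches


if __name__ == "__main__":
    inventory = {
        "sub-prod": [{"name": "vm-01", "tags": {"CreatedBy": "David Summers"}}],
        "sub-dev": [
            {"name": "vm-02", "tags": {"createdby": "David Summers"}},
            {"name": "vm-03", "tags": {"CreatedBy": "Someone Else"}},
        ],
    }
    print(find_by_tag(inventory, "CreatedBy", "David Summers"))
```

The same lookup covers the departed-employee scenario: every resource tagged with that person's name is returned regardless of which subscription it lives in.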
Using CloudShell, PowerShell, VSCode, Azure Functions or Azure Automation, we can do some exciting things. Let’s say we tag resources that do not need to run outside of business hours. Then we tag those systems like OperationHours: 9AMto5PM – through automation, we can then enumerate all resources where this tag value exists and initiate a shut-down/de-allocate/scale-down/stop or any applicable operation that will reduce the cost for those resources.
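The selection logic for that kind of automation might look like the sketch below, written in Python over an in-memory list of resources. It assumes tag values follow the `9AMto5PM` shape used above; the function names are hypothetical, and the actual stop or de-allocate call is deliberately left out.

```python
from datetime import time


def _parse_hour(text):
    """Convert text such as '9AM' or '5PM' into a time on the hour."""
    hour = int(text[:-2]) % 12  # '12AM' becomes 0, '12PM' becomes 12 after the shift
    if text.endswith("PM"):
        hour += 12
    return time(hour=hour)


def parse_operation_hours(value):
    """Parse a tag value shaped like '9AMto5PM' into (start, end) times."""
    start_text, end_text = value.strip().upper().split("TO")
    return _parse_hour(start_text), _parse_hour(end_text)


def resources_to_stop(resources, now):
    """Names of tagged resources that are outside their operating hours at 'now'."""
    to_stop = []
    for resource in resources:
        hours = resource.get("tags", {}).get("OperationHours")
        if hours is None:
            continue  # resources without the tag are never touched
        start, end = parse_operation_hours(hours)
        if not (start <= now < end):
            to_stop.append(resource["name"])
    return to_stop


if __name__ == "__main__":
    fleet = [
        {"name": "vm-a", "tags": {"OperationHours": "9AMto5PM"}},
        {"name": "vm-b", "tags": {}},
    ]
    print(resources_to_stop(fleet, time(20, 0)))
```

Keeping the selection separate from the shut-down action makes it easy to run the script in a report-only mode before letting it change anything.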
Next, we can enable filtering on Cost Management. This is a standard view that you will see when opening up Cost Management.
Tagging and cost management
On its own, this is not very useful. However, when we add a filter, we can drill down to an individual tag or a set of tags. Here I have filtered on the tag CreatedBy and the value ‘David Summers’. From here, we can further refine the criteria by grouping by resource type or other tags.
Tagging and cost management with filter
The successful usage of a suitable tag schema in conjunction with policies greatly assists with the remediation and forced compliance of tags. We can help you to design a suitable tag schema that fits your specific business and technical requirements, implement a set of policies and remediate missing tags.
Stay tuned for the next episode, which will take you through the massive advantages of Azure Advisor. Find more resources in the Data#3 Knowledge Centre.
Contact a Data#3 Azure specialist if you need assistance with Azure Remediation tasks.