5 Ways MSPs Can Reduce Azure Virtual Desktop Costs with Nerdio Manager for MSP

Let’s get right to the point: the cloud can be expensive, especially if you do not know what to focus on or lack the proper tooling or expertise. If I were to say that you can save up to 70% on Azure compute and storage costs when it comes to Azure Virtual Desktop, would you believe me? Maybe not, and I do not blame you. After all, I am a sales guy as well.

MSPs (Managed Service Providers) would probably “buy it” when talking about large(r) environments with hundreds, if not thousands of users. However, often MSPs manage multiple smaller customer environments where savings as high as 65-70% seem unlikely.

This article outlines five cost reduction options delivered by Nerdio Manager for MSP (NMM) that lower the costs of a typical MSP-sized AVD deployment by more than 65%. Typically, the bigger the environment gets, the higher the savings will be.

What Costs Are Associated with Azure Virtual Desktop? 

Across our product portfolio we now manage two million users, giving us deep insight and expertise into the typical cost components of using Azure Virtual Desktop. We have analysed several thousand Nerdio AVD deployments ranging from 5-10 users up to 10,000+ users. Putting it all together, the costs of an AVD deployment can be broken down into five categories:

  1. Compute (Virtual Machines used as AVD session hosts) – 70% 
  2. OS disks storage (managed disks attached to session host VMs) – 12% 
  3. FSLogix storage (Azure Files or Azure NetApp Files hosting user profiles) – 9% 
  4. Networking (egress bandwidth, VPN gateways, global VNet peering) – 3% 
  5. Other (images, Log Analytics, Azure Automation, backup) – 6%

As you can see, the first three categories (compute and storage) make up more than 90% of the total, so they are the main topics of this article. Other optimizations are possible as well, for example with images, though the cost savings will be minimal in comparison.

To perform the breakdown and cost analysis, we’ll work with a sample use case based on a real-world scenario. We start by looking at the “unoptimized” costs, which are calculated on a per-user basis. Then we break down the five cost optimization strategies MSPs can leverage when using NMM. In our sample use case of 32 users, the solution reduces AVD costs by almost 70%! Lastly, we explain how NMM is evolving to add Reserved Instance analytics so MSPs can reduce compute costs even further.

Here are the assumptions we’ve based our calculations on when illustrating the five cost optimization strategies using NMM:

  • Number of users: 32 
    • MSP client environments typically range from 5 to around 50 users, with 30+ being the sweet spot. There are exceptions, of course, and more users will mean higher overall cost savings.
  • User type: Heavy (per Microsoft’s definition, this means 2 users per vCPU)
  • Session host VM size: E4s_v4 (common VM used in AVD environments) 
  • OS disk: P10 – 128 GB Premium SSD (common disk size used in multi-session deployments) 
  • FSLogix profile size: 20 GB (stored on Azure Files Premium) 
  • Hours in typical work week: 50 (10 hours per weekday) 
  • Azure pricing: South Central US (list pricing) 

Given the above, the “unoptimized” costs are as follows:

  • Compute: $882 
    • 4 E4s_v4 VMs are needed to support 32 users (8 users per VM)
  • OS disks: $72 
    • Each of the 4 VMs needs a P10 SSD disk 
  • FSLogix storage: $123 
    • 20 GB per user at $0.19 per GB 
  • Total: $1076 ($33.64/user/month)
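
As a quick sanity check, the sizing and the baseline figures above can be reproduced with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the 4 vCPUs per VM is the published size of an E4s_v4. Summing the three rounded components gives $1,077, a dollar more than the quoted total, presumably because each component was rounded separately.

```python
import math

# Assumptions taken from the list above.
USERS = 32
USERS_PER_VCPU = 2   # "heavy" user type
VCPUS_PER_VM = 4     # an E4s_v4 VM has 4 vCPUs
PROFILE_GB = 20      # FSLogix profile size per user

users_per_vm = USERS_PER_VCPU * VCPUS_PER_VM   # users one session host can serve
vm_count = math.ceil(USERS / users_per_vm)     # session hosts needed
profile_storage_gb = USERS * PROFILE_GB        # total profile storage

# Rounded monthly figures quoted above, in USD.
baseline = {"compute": 882, "os_disks": 72, "fslogix": 123}
baseline_total = sum(baseline.values())
per_user = baseline_total / USERS

print(f"{vm_count} VMs, {profile_storage_gb} GB of profile storage")
print(f"Total ${baseline_total}/month, about ${per_user:.2f} per user")
```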

This is how we normally see AVD being deployed. Even without any optimizations applied, the cost per user is about $33.64/month, which is not too bad. However, we can do a WHOLE LOT better than that. Let’s see how low we can go using our five cost optimization strategies.

Strategy #1: VM Power Management 

The compute cost of AVD session hosts is by far the largest cost component. Since users only use their machines for about 50 hours per week on average, there is no need to keep them running 24/7. Nerdio can control the starting and stopping of these machines, for example turning all of them off at night and on weekends, so machines run only when they are needed.

Below is the breakdown of how this strategy reduces our unoptimized cost figures. Implementing power management lowers the compute cost component and the total cost per user is decreased by 58% using just strategy #1.

  • Compute: $262 (reduction of $620 or 70%) 
    • 50 work hours is 30% of the total 168 hours in a week. Keeping VMs on 30% of the time means that the remaining 70% of the time we’re saving on VM compute costs. 
  • OS disks: $72 (no change) 
    • Even when VMs are powered off the OS disks are still incurring costs 
  • FSLogix storage: $123 (no change) 
    • Even when VMs are powered off the storage of user profiles is incurring costs 
  • Total: $457 ($14.28/user)
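
The arithmetic behind strategy #1 can be sketched as follows. The only modeling assumption is that compute is billed in proportion to running hours while disks and profile storage are billed all month. The names are illustrative, and the result (about $457 in total, a 58% reduction) matches the per-user figure of $14.28 within rounding.

```python
HOURS_PER_WEEK = 7 * 24        # 168
WORK_HOURS_PER_WEEK = 50

# Fraction of the week the session hosts are powered on (about 0.30).
on_fraction = WORK_HOURS_PER_WEEK / HOURS_PER_WEEK

# Rounded monthly baseline figures from the article, in USD.
baseline = {"compute": 882, "os_disks": 72, "fslogix": 123}

# Only compute scales with running hours; storage is billed regardless.
optimized = dict(baseline, compute=baseline["compute"] * on_fraction)

baseline_total = sum(baseline.values())
optimized_total = sum(optimized.values())
reduction = 1 - optimized_total / baseline_total

print(f"Compute: ${optimized['compute']:.0f}")
print(f"Total: ${optimized_total:.0f} ({reduction:.0%} lower than baseline)")
```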

Strategy #2: Burst Capacity, Provisioning VMs Just In Time

When VMs are shut down, you still pay for the attached storage, specifically the OS disk. If we remove some or all of these VMs, you no longer have to pay for the associated storage. But what happens if I need them again, you ask? Read on…

For the analysis below, we will assume that 50% of the VMs (2 in our case) will always exist (this is known as base capacity) and the remaining 2 (burst capacity) can be created automatically only as needed and deleted when no longer in use. This is where the Nerdio magic comes in. Because burst capacity deletes and re-creates half of the VMs each day, VMs are rebuilt from the latest image version on a regular basis, which keeps our session hosts in a “pristine” state and avoids configuration drift. Re-creating a VM also includes re-joining it to the domain, for example.

Below is the breakdown of how strategy #2 further reduces our original unoptimized cost figures. This includes cost savings from strategy #1 as well. Total per user cost is decreased by 60% when compared to the unoptimized cost.

  • Compute: $262 (reduction of $620 or 70%) 
    • 50 work hours is 30% of the total 168 hours in a week.  Keeping VMs on 30% of the time means that the remaining 70% of the time we’re saving on VM compute costs. 
  • OS disks: $47 (reduction of $25 or 35%)
    • 2 of the VMs with their OS disks will be deleted when not in use and the OS disks will no longer incur storage costs until the VMs are re-created. 
  • FSLogix storage: $123 (no change) 
    • Even when VMs are powered off, the storage of user profiles is incurring costs. 
  • Total: $432 ($13.49/user) 
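
One way to arrive at the $47 OS disk figure is sketched below. It assumes, as a simplification not stated in the article, that the two burst VMs and their disks exist only during the 50 working hours each week, and that each P10 disk costs a quarter of the $72 quoted for four disks.

```python
ON_FRACTION = 50 / 168       # share of the week burst VMs are assumed to exist
DISK_MONTHLY = 72 / 4        # per-disk cost implied by $72 for 4 disks

BASE_VMS = 2                 # always exist, so their disks are billed all month
BURST_VMS = 2                # created on demand, deleted when idle

base_cost = BASE_VMS * DISK_MONTHLY
burst_cost = BURST_VMS * DISK_MONTHLY * ON_FRACTION
os_disk_cost = base_cost + burst_cost

print(f"OS disks: ${os_disk_cost:.0f} per month")
```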

Strategy #3:  Auto-scaling OS disks

We have already saved 35% on OS disk storage costs by implementing reduction strategy #2. However, there are still 2 VMs remaining as base capacity (to allow for faster boot-up times), and these VMs are not always turned on.

When these machines are started an expensive, premium SSD disk (P10) is being used, which is great as we want to take advantage of the excellent performance that comes with it. However, what if we could change this to a much cheaper type of disk when the machine is shut down? With OS Disk Auto-scaling that is exactly what NMM does.

When configuring auto-scale on a host pool, all you have to do is select the “running OS disk type” and the “stopped OS disk type”. You probably guessed it, but when stopped, the Premium SSD will be converted to a Standard HDD, and when the VM is started again the disk will be swapped back to the Premium SSD.

Below is the breakdown of how strategy #3 reduces our original unoptimized cost figures.

  • Compute: $262 (reduction of $620 or 70%) 
  • OS disks: $30 (reduction of $42 or 58%)
    • 2 of the VMs with their OS disks will be deleted when not in use and the OS disks will no longer incur storage costs until the VMs are re-created.   
    • Remaining 2 VMs’ OS disks will be converted to Standard HDD when stopped and back to Premium SSD when started back up. 
  • FSLogix storage: $123 (no change) 
    • Even when VMs are powered off the storage of user profiles is incurring costs. 
  • Total: $415 ($12.97/user) 
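
The effect of swapping disk types by power state can be approximated with a time-weighted cost, as in the sketch below. The Standard HDD price of $6 per month is a hypothetical placeholder chosen for illustration, not a quoted Azure list price, and the function names are made up. `Premium_LRS` and `Standard_LRS` are the Azure managed disk SKU names for Premium SSD and Standard HDD.

```python
ON_FRACTION = 50 / 168       # share of the week the VMs are running
PREMIUM_MONTHLY = 72 / 4     # per-disk P10 cost implied by the article's $72
STANDARD_MONTHLY = 6.0       # HYPOTHETICAL Standard HDD price, illustration only


def disk_sku_for(vm_running: bool) -> str:
    """Pick the managed disk SKU for a VM's power state."""
    return "Premium_LRS" if vm_running else "Standard_LRS"


def base_disk_cost(on_fraction: float, premium: float, standard: float) -> float:
    """Disk that always exists: Premium while running, Standard while stopped."""
    return on_fraction * premium + (1 - on_fraction) * standard


def burst_disk_cost(on_fraction: float, premium: float) -> float:
    """Disk that only exists (as Premium) while its burst VM exists."""
    return on_fraction * premium


total = (2 * base_disk_cost(ON_FRACTION, PREMIUM_MONTHLY, STANDARD_MONTHLY)
         + 2 * burst_disk_cost(ON_FRACTION, PREMIUM_MONTHLY))

print(f"OS disks: about ${total:.0f} per month")
```

With these inputs the estimate lands close to the $30 quoted above; a different Standard HDD price would shift it accordingly.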

When applying and combining just these first three reduction strategies, the AVD costs have been lowered by 61% as compared to an unoptimized deployment. Great, but there is more!

Strategy #4: Shrink OS Disk From 128 GB to 64 GB 

Standard Azure Gallery Windows 10 and 11 images are all 128 GB in size (the OS disk, that is). Much of that storage goes unused, and you can easily do without it so that you aren’t paying for it. Using one of our Scripted Actions, these OS disks can be shrunk down to 64 GB, meaning they are 50% smaller and thus cut storage costs by 50%.

In a lot of environments (multi-session, pooled desktops) there is no data being saved to the C: drive because all user related data is redirected to an FSLogix profile container, for example. Also, since VMs are being regularly deleted and re-created from the image, there is no growing disk space consumption on the system drive. You can also layer on just-in-time provisioning (strategy #2) and OS Disk Auto-scale (strategy #3) with disk-size reduction. 

Below is the breakdown of how strategy #4 reduces our original unoptimized cost figures. There is one more option to go and we are up to 63% savings already.

  • Compute: $262 (reduction of $620 or 70%)
  • OS disks: $15 (reduction of $57 or 79%)
    • 2 of the VMs with their OS disks will be deleted when not in use and the OS disks will no longer incur storage costs until the VMs are re-created.   
    • Remaining 2 VMs’ OS disks will be converted to Standard HDD when stopped and back to Premium SSD when started back up. 
    • OS disk size reduced from default 128 GB to 64 GB for all VMs. 
  • FSLogix storage: $123 (no change) 
    • Even when VMs are powered off the storage of user profiles is incurring costs. 
  • Total: $401 ($12.52/user) 

Strategy #5: Profile Container Whitespace Reduction and Storage Auto-scale 

FSLogix profile containers (which store user profiles) are VHD(X) files stored on a file share. With AVD, Azure Files Premium is often used instead of a traditional file server with file shares, a NAS (network attached storage), or any other type of storage. By default, FSLogix profile containers are thin provisioned: they grow over time and never shrink without intervention. You can probably imagine the unnecessary costs that come with this kind of unchecked growth.

By removing white space from FSLogix profile containers, Nerdio typically saves up to 50% on FSLogix storage costs. However, reducing space usage alone is not sufficient since Azure Files Premium costs are determined based on provisioned quota, not actual usage. Luckily, we have a separate auto-scale engine to take care of that as well.

Nerdio automatically adjusts the provisioned quota on Azure Files Premium shares based on available free space and storage latency. When latency spikes due to insufficient performance, Nerdio Manager will automatically increase the provisioned quota to increase performance and decrease it when it’s no longer needed. 
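
To make the idea concrete, here is a minimal sketch of a quota decision driven by free space and latency. It is not Nerdio’s actual algorithm: the `ShareMetrics` type, the `next_quota` function, and every threshold and step size below are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShareMetrics:
    quota_gib: int       # currently provisioned quota
    used_gib: int        # space actually consumed by profile containers
    latency_ms: float    # observed storage latency


# All values below are hypothetical, for illustration only.
MIN_FREE_FRACTION = 0.10   # grow when less than 10% is free
MAX_FREE_FRACTION = 0.30   # shrink when more than 30% is free
LATENCY_LIMIT_MS = 10.0    # grow when latency exceeds this
STEP_GIB = 100             # quota change per adjustment
MIN_QUOTA_GIB = 100        # never shrink below this


def next_quota(m: ShareMetrics) -> int:
    """Return the quota to provision next, given current metrics."""
    free_fraction = (m.quota_gib - m.used_gib) / m.quota_gib
    if m.latency_ms > LATENCY_LIMIT_MS or free_fraction < MIN_FREE_FRACTION:
        return m.quota_gib + STEP_GIB
    if free_fraction > MAX_FREE_FRACTION:
        # Shrink, but keep the minimum free headroom above current usage.
        floor = max(MIN_QUOTA_GIB, int(m.used_gib / (1 - MIN_FREE_FRACTION)) + 1)
        return max(m.quota_gib - STEP_GIB, floor)
    return m.quota_gib


print(next_quota(ShareMetrics(quota_gib=640, used_gib=600, latency_ms=3.0)))
print(next_quota(ShareMetrics(quota_gib=640, used_gib=320, latency_ms=3.0)))
```

The first call grows the quota because free space is low; the second shrinks it because half the share is unused and latency is healthy.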

Below is the breakdown of how strategy #5 reduces our original unoptimized cost figures.

  • Compute: $262 (reduction of $620 or 70%)
  • OS disks: $15 (reduction of $57 or 79%)
  • FSLogix storage: $61 (reduction of $62 or 50%)
    • Storage consumption is reduced by 50% by running a scheduled white space removal process.
    • Performance and free space are balanced with costs using Nerdio Manager storage auto-scaling for Azure Files. 
  • Total: $339 ($10.60/user)

Conclusion

After applying these five cost reduction options using Nerdio Manager for MSP, the total costs have been reduced by 68%! We started with a typical, unoptimized AVD deployment where we paid $33.64 per user per month and worked our way down to $10.60 per user per month, which is quite incredible. To make sure all of our calculations above are in one place and easy to digest, we’ve put them into the overview table below.
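
The stage-by-stage figures can also be tabulated with a short script. The sketch below uses only the rounded component costs quoted in each strategy, so its totals can differ from the article’s by a dollar and its percentages by about a point.

```python
USERS = 32

# (stage, compute, OS disks, FSLogix storage): rounded monthly USD from the article.
stages = [
    ("Unoptimized", 882, 72, 123),
    ("#1 VM power management", 262, 72, 123),
    ("#2 Burst capacity", 262, 47, 123),
    ("#3 OS disk auto-scale", 262, 30, 123),
    ("#4 Shrink OS disk", 262, 15, 123),
    ("#5 Profile whitespace + storage auto-scale", 262, 15, 61),
]

baseline_total = sum(stages[0][1:])

rows = []
for name, compute, disks, fslogix in stages:
    total = compute + disks + fslogix
    rows.append((name, total, total / USERS, 1 - total / baseline_total))

print(f"{'Stage':<44}{'Total':>8}{'Per user':>10}{'Saved':>8}")
for name, total, per_user, saved in rows:
    print(f"{name:<44}{total:>8}{per_user:>10.2f}{saved:>8.0%}")
```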

Coming Soon: Nerdio Manager for MSP Reserved Instances (RI) Analytics (Cost Reduction Strategy #6!)

Typically, auto-scaling pay-as-you-go (PAYG) VMs save more than using Reserved Instances (RIs) for the VMs involved. But what if you could combine auto-scaling to get the 60-70% compute cost reduction and add RIs to save an additional 50-60% on the remaining compute costs? With Nerdio Manager’s RI Analytics (which will be released soon), you can do just that. Here is what to expect in the product this Spring.

Reserved Instances are typically purchased for all, or almost all, session host VMs in an AVD deployment. However, once compute capacity has been reserved and pre-paid, auto-scaling no longer makes sense from a cost reduction perspective. Therefore, we’ve found a more efficient way to use RIs than reserving all compute capacity. 

First, we would implement auto-scaling to reduce the total number of hours the VMs are turned on. In our example, that’s 50 hours out of 168 each week. In most real-world scenarios, the number of hours is even lower because not all users log in at the beginning of each day and not all users log off completely at the end of the day. The capacity “ramps up” and then “ramps down” with fewer hours when all CPU cores are utilized. 

This is where Nerdio Manager’s RI Analytics comes in. After observing a week or more of auto-scale behavior, NMM will recommend the number of CPU cores to reserve based on actual usage. This means the total number of compute hours is first reduced by auto-scaling and then the cost is further reduced by reservations for those remaining hours.  
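
Taking the percentages quoted above at face value, the combined effect of auto-scaling followed by reservations can be worked out as below. This is a simplified sketch: it assumes the reservation discount applies cleanly to the compute hours that remain after auto-scaling, and the function name is illustrative.

```python
def combined_reduction(autoscale_saving: float, ri_saving: float) -> float:
    """Overall compute reduction when RI savings apply to what auto-scaling leaves."""
    remaining = (1 - autoscale_saving) * (1 - ri_saving)
    return 1 - remaining


# 70% from auto-scaling in the example, then 50-60% from reservations.
low = combined_reduction(0.70, 0.50)
high = combined_reduction(0.70, 0.60)

print(f"Combined compute reduction: {low:.0%} to {high:.0%}")
```

In other words, only 12-15% of the original compute cost would remain under these assumptions.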

As mentioned, more on this in the coming weeks!

Learn more about Nerdio Manager for MSP and get started for free!
