VM Sizing and Deployment Strategy for Cost Savings for MSPs

Liz Hoscheid

5 May, 2020

Partners frequently ask us to recommend the best VM series size for their use case, or at least the standard for most deployments. Unfortunately, these are difficult questions to answer due to the many variables which must be considered. However, with a basic understanding of the most common VM series, and a few sizing strategies, perfect VM series allocation can be achieved.

In this article, we’ll break down the top three most commonly utilized VM series sizes and explain some key strategies to leverage when sizing new deployments to ensure optimal cost savings.

Most Commonly Utilized VM Series

B-Series

B-series VMs are the minimum we recommend for a test environment. Nerdio deploys all new environments with some B-series VMs. However, this is not necessarily our recommendation for production use. We provision with B-series to help prevent our partners from having a huge Azure bill after their first month in the environment. This is especially important because that first month often consists of configuring and building out the solution, but the client may not even begin migrating into the new environment until after the first month.

B-Series VMs are specifically designed by Microsoft to optimize cost savings. This can be great for your bottom line, but it means they come with some significant limitations.

Burstable CPU Quota & the Credit Bank

The B-series VMs are burstable. This means Microsoft designed them to operate at a baseline (quota). If tasks require more than the base resources, these VMs can burst up to complete the given task. The way this is managed is through a credit bank. Each hour, the B-series VMs accumulate credits during idle time. Once the resources become needed those credits will then be spent executing tasks.

This is great for cost savings, but the downside comes when the credits run out. When this happens, performance of the VM drops to the baseline, which can feel like a crawl if end users are working on the VM at that time. This can happen very quickly if a B-series VM is assigned to a pool where users are logged in and working for hours at a time.
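
To make the credit bank behavior concrete, here is a minimal sketch of a toy model. The function name, the hourly granularity, and every number in the example are illustrative assumptions, not real B-series figures, and Azure's actual credit accounting is more detailed than this.

```python
def simulate_credit_bank(demand_by_hour, baseline, max_credits, start_credits=0.0):
    """Toy model of a burstable VM (hypothetical units, not Azure's real accounting).

    demand_by_hour: CPU demand per hour, in percent.
    baseline: CPU percent the VM can always deliver.
    Returns a list of (delivered_cpu_percent, credits_left) per hour.
    """
    credits = start_credits
    timeline = []
    for demand in demand_by_hour:
        if demand <= baseline:
            # Idle headroom is banked as credits, up to the cap.
            credits = min(max_credits, credits + (baseline - demand))
            delivered = demand
        else:
            # Bursting spends credits; with an empty bank the VM stays at baseline.
            spend = min(credits, demand - baseline)
            credits -= spend
            delivered = baseline + spend
        timeline.append((delivered, credits))
    return timeline


# Two quiet hours followed by three busy hours (all values are made up).
timeline = simulate_credit_bank([10, 10, 150, 150, 150], baseline=60, max_credits=100)
for hour, (delivered, credits) in enumerate(timeline, start=1):
    print(f"hour {hour}: delivered {delivered}% CPU, {credits} credits left")
```

In this made-up run the VM bursts fully for one hour, partially for the next, and then sits at the baseline, which is the slowdown users notice on a busy pool.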

IOPS

There are also limitations when it comes to IOPS. For instance, a B2ms VM is capped at 1,920 IOPS. This means that even if a Standard SSD (6,000 IOPS max) or Premium SSD (20,000 IOPS max) is assigned to that VM, it will never be able to use the added IOPS capacity. We frequently see partners pairing SSDs with B-series VMs without knowing this. As a result, they waste money, because pairing an SSD with a B-series VM cannot improve performance beyond the VM’s cap. HDDs have an IOPS cap of 2,000, which means HDDs are more than enough for the standard B-series VMs.
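
The reasoning above comes down to taking the smaller of the two caps. A minimal sketch, using only the figures quoted in this article (the function and variable names are our own):

```python
def effective_iops(vm_iops_cap, disk_iops_max):
    """The slower of the VM cap and the disk cap is what the workload actually gets."""
    return min(vm_iops_cap, disk_iops_max)


B2MS_IOPS_CAP = 1920  # B2ms cap quoted in this article

# Disk maximums quoted in this article.
disk_iops_max = {"HDD": 2000, "Standard SSD": 6000, "Premium SSD": 20000}

for disk_name, disk_max in disk_iops_max.items():
    print(f"{disk_name}: {effective_iops(B2MS_IOPS_CAP, disk_max)} effective IOPS")
```

Every disk type yields the same effective figure on a B2ms, so the more expensive disks add cost without adding throughput.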

D-Series

The D-series VMs don’t have the limitations of B-series and their resources are available 100% of the time. We see the D-series VMs used most effectively for servers like FS01, or other LOB servers which use resources at a steady rate throughout the day. This includes desktop pools. D-series VMs are effective when applied to pools where users will not need large amounts of RAM for their daily tasks. D-series VMs have a 1:4 core-to-memory ratio (4 GB of memory per core).

E-Series

The E-Series VMs are almost identical to the D-Series except they have a 1:8 core-to-memory ratio rather than 1:4. This is good for environments with users who run memory-intensive applications or like to keep several browser tabs open at the same time. We find E-Series to be the most common VM series deployed with WVD pools.
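
As a quick illustration of the two ratios, the sketch below derives memory from core count. The dictionary and function names are our own; the ratios are the ones stated in this article.

```python
# GB of memory per core, per the ratios described in this article.
MEMORY_GB_PER_CORE = {"D": 4, "E": 8}


def memory_gb(series, cores):
    """Memory implied by the series' core-to-memory ratio."""
    return cores * MEMORY_GB_PER_CORE[series]


for series in ("D", "E"):
    print(f"{series}-series, 8 cores: {memory_gb(series, 8)} GB of memory")
```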

Now that we’ve covered the different VM series sizes, let’s talk about use case & deployment strategies.

Use Cases

B-Series Use Case

We most commonly see the B-series VMs applied to the domain controller (DC01). DC01 doesn’t usually perform steady tasks and instead executes bursts of processes throughout the day. As a result, it’s the perfect fit for something like a B2ms. When it comes to WVD pools, we’ve only seen B-series VMs function well with 2-3 very low-level users.

D-Series Use Case

D-series VMs are most commonly applied to FS01. FS01 is the source for FSLogix and manages several tasks including mounting user VHDs to the various session hosts, folder redirection, and any changes that are made in the user’s desktop, documents, or favorites folders. For pools with 10+ users, these tasks can add up quickly and we often see the credit bank exhausted if FS01 is a B-series VM. As a result, we recommend the D-series for FS01 in almost all scenarios. Depending on user count, the D2sv3, D4sv3, D8sv3, or D16sv3 may be appropriate. There isn’t a hard ratio to go by when it comes to sizing FS01 based on user count, but we’ve generally found D2sv3 to work for 10-15 users, D4sv3 for 15-30 users, D8sv3 for 30-60 users, and D16sv3 for 60-100+ users. Again, remember that these are rough guidelines. At the end of the day, consumption on FS01 should be monitored to ensure resources are not under- or over-allocated for the user count.
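
The rough guidelines above can be written as a simple lookup. This is only a sketch: the function name is hypothetical, the handling of the overlapping boundaries (15, 30, 60) and of pools under 10 users is our own assumption, and the result is a starting point to validate through monitoring.

```python
# (upper bound of user count, suggested FS01 size), from the rough guidelines above.
# Boundary users (15, 30, 60) are assigned to the smaller size here; that is an assumption.
FS01_GUIDELINES = [(15, "D2sv3"), (30, "D4sv3"), (60, "D8sv3")]


def suggest_fs01_size(user_count):
    """Return a starting-point FS01 size for a given user count."""
    for max_users, vm_size in FS01_GUIDELINES:
        if user_count <= max_users:
            return vm_size
    return "D16sv3"  # 60-100+ users


for users in (12, 25, 45, 80):
    print(f"{users} users: start with {suggest_fs01_size(users)}")
```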

E-Series Use Case

As mentioned above, E-series VMs are most commonly seen for desktop pools. Our recommendation, however, is to always test the environment for one to two weeks to make sure the VMs are allocated for optimal cost savings. When quoting E- or D-series, we recommend initially basing the quote on CPU. That would look something like a 2:1 user-to-core ratio on those servers (as an example). If there are 50 users in an environment, 25 cores should fully accommodate those users. With that in mind, quoting Pool-A with three E8sv3 session hosts (24 cores in total, close to the 25-core target) would be appropriate, since E8sv3s provide 8 cores & 64GB of memory each. If it’s anticipated those users wouldn’t use all that memory, then a D8sv3 may be more appropriate. D8sv3 would provide 8 cores & 32GB of memory. Remember that E-series has a 1:8 core-to-memory ratio while D-series has a 1:4 core-to-memory ratio.
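
The quoting arithmetic can be sketched as follows. The function names are hypothetical, and the 2:1 user-to-core ratio is the example ratio from the paragraph above rather than a fixed rule.

```python
import math


def cores_needed(users, users_per_core=2):
    """Cores implied by a user-to-core ratio (2:1 is only the example ratio)."""
    return users / users_per_core


def hosts_needed(users, cores_per_host, users_per_core=2):
    """Session hosts required if the ratio is treated as a strict ceiling."""
    return math.ceil(cores_needed(users, users_per_core) / cores_per_host)


def users_per_core_for(users, hosts, cores_per_host):
    """Actual user-to-core ratio for a given quote."""
    return users / (hosts * cores_per_host)


print(cores_needed(50))                         # cores for 50 users at 2:1
print(hosts_needed(50, cores_per_host=8))       # hosts if rounding up strictly
print(round(users_per_core_for(50, 3, 8), 2))   # ratio with three 8-core hosts
```

Rounding up strictly gives four 8-core hosts for 50 users, while three hosts run slightly above 2:1. Which way to round is a judgment call, and the one to two week test period is the place to confirm it.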

This brings us into our next section, Deployment Strategy.

Deployment Strategy

This section is mostly related to FS01, Dedicated Desktops, & Pooled Session Hosts. However, the principle here can be applied to other servers as well.

One of the most important takeaways from this section is to NOT purchase Reserved Instances (RIs) until after one to two weeks in the new environment. The reason is that it’s hard to know if the environment is appropriately sized until after users are in and working. As an MSP, it doesn’t make sense to intricately monitor user habits prior to migrating into the cloud. As a result, it won’t be well known what type of resources each user consumes on a daily basis. Given this, everything that’s done to size the environment prior to go-live is just an educated guess. It would be unfortunate to get locked into a 3-year RI only to find out the environment was under- or over-specified and a penalty must be paid to Microsoft to get out of the RI contract.

We all know the end user experience is king when it comes to solution adoption. As a result, it is critical to make sure the environment is not only fully dialed in with all the necessary applications and software, but that it’s also been tested for performance. Our recommendation is to, if possible, log in with 50% to 75% of the users in the environment prior to go-live. Make sure to open any LOB applications users will rely on, along with any web-based applications and the estimated number of browser tabs they may use. Be sure the log in/out process is seamless (FS01 sized correctly) and that general performance on the pooled desktops is smooth (session hosts sized correctly). While logged in with users, make sure to either monitor performance with an RMM tool, or log in to each session host and monitor performance via Task Manager. If users are experiencing latency in their session, it may show up as CPU maxing out on the VM (upgrade to a larger VM in the current series) or memory maxing out (if the current VM series is D, upgrade to E). This testing will also make it clear whether the VMs are overallocated. If all users are logged in and the session hosts never spike above 50%, some cost savings can be achieved by lowering the VM series size.
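
The decision logic in the paragraph above could be sketched like this. The function name and the 95% saturation threshold are assumptions, as is the handling of memory pressure on a series other than D; only the 50% low-water mark and the D-to-E move come from the text.

```python
def sizing_recommendation(series, peak_cpu_percent, peak_memory_percent,
                          saturation_threshold=95, low_threshold=50):
    """Suggest a resize based on peak usage observed during load testing.

    saturation_threshold is an assumed stand-in for "maxing out".
    """
    if peak_cpu_percent >= saturation_threshold:
        return "upgrade to a larger VM in the current series"
    if peak_memory_percent >= saturation_threshold:
        if series == "D":
            return "move from D-series to E-series"
        # Not covered by the article; a larger size is our assumption.
        return "upgrade to a larger VM in the current series"
    if peak_cpu_percent < low_threshold and peak_memory_percent < low_threshold:
        return "consider a smaller VM size"
    return "keep the current size"


print(sizing_recommendation("D", peak_cpu_percent=60, peak_memory_percent=97))
print(sizing_recommendation("E", peak_cpu_percent=30, peak_memory_percent=35))
```

The peak values would come from whatever the RMM tool or Task Manager reports while the test users are logged in.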

Final Thoughts

If you’ve followed this guide you should be equipped to size almost any new environment with confidence. We understand that with all the different VM sizes Microsoft offers it can be a bit confusing and overwhelming at first. Don’t worry, though, after just a few deployments you’ll be quoting new environments with ease.

Terminology

Read a full article on Azure terminology, hierarchy, and resources here.

Reserved Instance (RI) – An RI is basically Microsoft’s way of anticipating (as best as possible) the resources that will be utilized in their data center in a given month. As a result, they provide large incentives (sometimes up to 57% off) if partners are willing to commit to specific resources for 1 or 3 years. In the past, the RI was paid up front as a lump sum. However, around the end of 2019 they updated their offering and now RIs can be paid on a monthly basis. This allows for a much smaller up front commitment.

In the event the RI needs to be terminated, Microsoft requires a payment worth 12% of the remainder in the contract. As a hypothetical scenario, let’s say the RI was purchased for 3 years and the total cost was $300 for those 3 years. After 2 years, it became necessary to move away from that RI. In this scenario, only $100 would be left in the contract, and given the 12% termination fee (of the remainder), only $12 would be owed.
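
The termination example can be checked with a few lines of arithmetic. This sketch assumes the contract cost is spread evenly across the term, as in the hypothetical scenario above; the function name is our own.

```python
def ri_termination_fee(total_cost, term_years, years_elapsed, fee_rate=0.12):
    """Fee owed on early termination: fee_rate times the unpaid remainder.

    Assumes the total cost is spread evenly over the term.
    """
    remaining = total_cost * (term_years - years_elapsed) / term_years
    return remaining * fee_rate


fee = ri_termination_fee(total_cost=300, term_years=3, years_elapsed=2)
print(f"Termination fee: ${fee:.2f}")
```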

One last thing to know about RIs: they can be exchanged across Azure regions and VM series without penalty. As an example, if a D2sv3 RI was purchased and it became necessary to upgrade the VMs in a pool to D4sv3s, you could simply purchase a second D2sv3 RI and those two would equate to the D4sv3. In the same way, four D2sv3 RIs could go towards an RI for an E8sv3, even if the E8sv3 was in a different Azure region than the four D2sv3 RIs.

Azure Hybrid Usage/Benefits (AHU) – AHU is when the partner purchases the Operating System (OS) license, rather than renting it from Microsoft. When AHU is used, an additional 20 to 30% savings can be achieved vs. the standard Pay-As-You-Go monthly price. When you combine AHU & RI, the cost savings can be up to 80%. This often runs into the tens of thousands of dollars when scaled out over 3 years.

The only exception to this is when dealing with B-series VMs. Microsoft has made these so cheap that purchasing the OS license for them would actually cost more in the long run. As a result, it’s cheaper to just leave these as is and rent the OS on a monthly basis from Microsoft.