
Azure IT Fundamentals: Compute, VMs and Core Quotas

April 17, 2019

Joseph Landes:

In this session, you will learn more about the concept of regions and where Microsoft hosts Azure. You will also learn about virtual machines. In Microsoft Azure, there are all kinds of virtual machines, in different series like A, B, D, N and others. What is each used for, and why would I, as an MSP, choose one over another? We’ll also dive deeper into the concept of B series virtual machines, otherwise known as burstable virtual machines. Compute and virtual machines are two more core elements needed to build a strong cloud practice in Microsoft Azure. We’ll take you through examples of each, using the Microsoft Azure portal and Nerdio for Azure, and by the end of this session, you will feel even more prepared to begin spinning up your first customer in Microsoft Azure.


Vadim Vladimirskiy:

So the first fundamental resource in Azure is compute. Resources are sort of the objects that run up the meter and are used in Azure. Some of them are paid, some of them are free, and they all have various licensing and billing models. Compute is just a fancy way of saying virtual machines. So, when you look at Azure, and you go into the virtual machines tab right here, what you can see is a list of virtual machines running in a particular environment. And you can see at the top, you can filter this environment. You can also filter it by location.

So there’s a term called region, which is the same as the location. A region is a physical collection of data centers that are geographically close to each other. They’re well interconnected, meaning they have fast network connectivity between them. And when you deploy a resource, you select what region or location that resource gets deployed into. And different regions have different capabilities. So again, think of the region as a collection of data centers. You don’t know which data center specifically something is gonna go into, but it will go into one of the data centers.

Okay, so different regions have different capabilities. There are a few useful sites. You can see Azure products by region. Let’s search for that. There is a page that Microsoft puts together, products by region, which used to be a table, but it’s gotten so big that it’s now a search box. For example, say you wanna search for GPU. You wanna know in what locations you have GPUs available. So you can see here, there are regions listed across the top. There’s a little scroll bar at the bottom and you can find what kind of VM you want. For instance, and we’re gonna get into this, the NV series, which uses NVIDIA GRID GPU cards, is available in the US in East US, East US 2, North Central, South Central, West US, et cetera, but not available, for example, in West Central or, in Canada, Canada Central, et cetera.

So when you deploy any resource, you select a region. Let’s go ahead and add a virtual machine. We’re not gonna actually create one, but I’ll show you what the selections are.

So the first thing you do is select what subscription it goes into. We’re already logged into a tenant, so we have to select the subscription. Then we select our resource group. And then one other selection down here is the region, and again, you can see the list of everything that’s here. Microsoft has, I think, about 52 regions or so. My subscription doesn’t have access to some of them.

For example, the government regions require a special type of subscription, so we don’t have access to those. But that is the concept of regions.

Then the next thing to keep in mind regarding VMs is that this is unlike a private cloud environment, where we can flexibly decide what type of resource is assigned to what kind of VM: how many CPU cores, how many gigs of RAM, how much storage, et cetera. In Azure, these things are packaged in very specific units called instances. When we talk about an instance of a VM, that is a particular VM size. Let’s look that up. So if you search for Azure virtual machine pricing, they have a pretty handy page that looks like this. And on this page, you will see that they try to group them a little bit, but I’ll go through.

So they try to group them by either all, general purpose, compute optimized, memory optimized, storage optimized, GPU, et cetera. So a few things to point out. You can see, for instance, this is the B series VM. Every instance name or instance type in it starts with the letter B. B stands for burstable in this particular case. It’s a special type of instance. I’m gonna skip it for now because there’s a lot more to say about it, and let’s jump down to the D series.

The D series is generally the most popular, standard type of general purpose instance. This number indicates the version. There used to be version one when it first came out, then version two, and now version three. And you’ll notice that each version uses a different CPU chipset. The later the version, the pricing is different and the functionality is different. But they usually have backwards compatibility, so you’ll see D v3 listed at the top, but there’s also D v2 that’s still available, which you’ll notice has a different processor architecture.

Okay, so how does this work? You look at the D instance, and it has a number that follows the letter. In the case of D v3, they made it really convenient: that number happens to indicate how many virtual CPUs are going to be inside of that VM. And then relative to that CPU count, there’s going to be a certain amount of memory. What you’ll notice is that as you go from one instance to the next one, you’re doubling the number of CPUs, and the ratio between RAM and CPU stays constant. In this case the ratio is one to four, so for every one core you get four gigs of RAM. So when you have 16 cores, you would expect 64 gigs of RAM. Okay, so for every family there is a certain CPU to RAM ratio.

So for instance, if we look at the v2 of the D series, you see that the ratio is one to three and a half. So for four cores we get 14 gigs. If you look at something called compute optimized, then you have a ratio of one to two, because you’re getting more compute for the same amount of RAM. So whereas in the D series version three you would get one CPU for four gigs of RAM, here you’re getting two CPUs. And the converse is true when you’re looking at memory optimized. Here the ratio is one to eight. So it’s really important to understand that you don’t have this sort of infinite flexibility in how you configure these VMs. You cannot have one with six cores, for example. Nothing on this list has six cores, whereas in the case of vSphere and MBC we certainly could have six cores available on the VM.
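The fixed CPU to RAM ratios described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic from the session: the dictionary labels and function names are made up for this example and are not official Azure SKU names or any Azure API.

```python
from __future__ import annotations

# GiB of RAM per vCPU for the families discussed in the session.
# The keys are illustrative labels, not official Azure SKU names.
RAM_PER_VCPU_GIB = {
    "D v3 (general purpose)": 4.0,
    "D v2 (general purpose)": 3.5,
    "compute optimized": 2.0,
    "memory optimized": 8.0,
}


def expected_ram_gib(family: str, vcpus: int) -> float:
    """Return the RAM a size in this family carries at the stated ratio."""
    return RAM_PER_VCPU_GIB[family] * vcpus


def sizes_by_doubling(family: str, start_vcpus: int, steps: int) -> list[tuple[int, float]]:
    """List (vCPU, RAM GiB) pairs, doubling the vCPU count at each step."""
    sizes = []
    vcpus = start_vcpus
    for _ in range(steps):
        sizes.append((vcpus, expected_ram_gib(family, vcpus)))
        vcpus *= 2
    return sizes


if __name__ == "__main__":
    for vcpus, ram in sizes_by_doubling("D v3 (general purpose)", 2, 4):
        print(f"{vcpus} vCPU -> {ram:g} GiB RAM")
```

Because sizes only come in doubling steps, a six-core size never appears in the generated list, which is the point made about the lack of free-form sizing.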

What the instance family and size determine is primarily the amount of CPU and RAM, and the amount of temporary storage. What do I mean by temporary storage? In Azure there is usually a D drive that’s deployed for a Windows VM, and it is labeled as temporary storage. For most instances it’s an SSD drive. Some families have non-SSD, but we’ll talk about the distinction in later sessions. So there’s always a temporary disk, which gets lost every time you deprovision and reprovision a VM. If you were to shut one down, like stop it at the Azure level, and then start it back up, it would likely start on another host inside of the Azure cloud, and anything stored on this temporary disk would actually get lost. So it’s important to keep in mind that this is good for scratch space, the page file and things that could go away at reboot time. No real data should be stored here.

If you actually click on one of these disks, it will say data loss warning; there will be a text file there warning you not to store anything.

The other thing you’ll notice, and I’m gonna switch this pricing from hour to month just to make the point, is that the cost is roughly proportional to the number of CPUs. So for instance, as we go from a D2 to a D4, we’re doubling the number of CPUs and we are roughly doubling, or in this case exactly doubling, the cost. And as we go from a four to an eight, we’re again doubling the cost. This is true for most series and most VMs in Azure. The cost is proportional to CPU. The RAM is proportional to CPU. So it’s a nice sort of unit to keep in mind that kind of unifies everything.

Okay. Let’s look at a few different instance types. So we have the D series that we talked about. This is kind of general purpose. The oldest and sort of most basic series is called the A series. I think that was the first one that came out. It’s a one to two CPU to RAM ratio. They do not support SSD temporary storage. It’s regular spinning media. And they’re fairly inexpensive if you’re doing them as pay as you go versus doing a reserved instance, which again, we’ll talk about later. You can see whenever there’s a letter M added to the instance name, that means it’s a memory optimized instance. So in this case, if an A2 is a one to two ratio, here’s an A2m. Its ratio is one to eight in terms of CPU to RAM.

The other common instance types are burstable instances. They’re somewhat complicated to understand, but the concept here is that if you log into this VM, you will see that it has anywhere from one CPU and one gig of RAM up to eight CPUs and 32 gigs of RAM, but there is a certain quota that is imposed on the performance of this VM. Think of this, in the world of VMware, as a limit on CPU consumption that you place on the VM object. So inside of the OS you’re seeing all eight cores, but you cannot always use all eight cores.

What does that mean? It means that you get a certain fraction of the eight cores that you can use on an ongoing basis. Let’s say you get 25%, which means you get a fourth, or two CPUs’ worth of capability, that you can use on an ongoing basis. Any time you use less than your quota, you’re building up what’s called credits. It’s called banking credits. And any time you use anything above your quota, you’re consuming credits, assuming that you’ve banked some credits in the past.

So what does this specifically mean? Let’s take a look at an example. The calculation is fairly complicated, I should say, but we’ve implemented it in NAP to really simplify things. So here is a B2s, which is two cores and four gigs of RAM. There is a certain amount of quota, and what this tells me, basically, is that NAP pulled in the information and said that over the last 24 hours we’ve banked 576 credits. We’ll explain that in a second. And we’ve consumed 82 credits. So what this is telling me is that on average, over the last 24 hours, I’ve consumed less CPU than the quota that’s given to this VM, which means it’s a good use case for a B instance. What you don’t wanna see is zero banked credits and all of them being consumed in the last 24 hours, because that indicates that you are CPU constrained: your VM is demanding more CPU than what is available to it.

If you mouse over this little i, it’s going to take this number of credits and figure out, based on the instance size and the quota on this instance, how many hours of running this instance at 100% CPU capacity you have. Six hours is typically the maximum. You can only bank six hours’ worth of credits over a 24 hour period. And if you don’t use them, it’s a rolling set of credits that you can bank.
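The six-hour figure can be reproduced with a rough sketch of the credit arithmetic. It assumes that one credit equals one vCPU running at 100% for one minute, and that a B2s has a baseline of 40% of a single vCPU; both figures are assumptions taken from Azure’s published B-series table as best understood here, not from the session itself. The function names are hypothetical and this is not how the portal necessarily implements the calculation.

```python
def credits_earned_per_hour(baseline_percent: float) -> float:
    """Credits accrued per hour, given the baseline as a % of one vCPU.

    Assumption: 1 credit = 1 vCPU at 100% for 1 minute.
    """
    return baseline_percent * 60 / 100


def credits_spent_per_hour(vcpus: int, utilization_percent: float = 100.0) -> float:
    """Credits consumed per hour when every vCPU runs at the given utilization."""
    return vcpus * utilization_percent * 60 / 100


def hours_at_full_load(banked_credits: float, vcpus: int, baseline_percent: float) -> float:
    """Hours the VM can run all vCPUs at 100% before the bank is empty."""
    net_drain = credits_spent_per_hour(vcpus) - credits_earned_per_hour(baseline_percent)
    if net_drain <= 0:
        return float("inf")  # the baseline already covers full load
    return banked_credits / net_drain


if __name__ == "__main__":
    # B2s from the session: 2 vCPUs, 576 banked credits.
    # The 40% baseline is an assumed figure, see the note above.
    print(hours_at_full_load(banked_credits=576, vcpus=2, baseline_percent=40))
```

Under those assumptions a B2s earns 24 credits an hour, so 576 banked credits is exactly 24 hours of accrual, and running both cores flat out drains the bank at a net 96 credits an hour.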

So what does this mean? I have this instance. It’s two cores. I’m using less CPU than is available to me with my quota, which means I’m banking credits. And if I were to just launch something on this VM that would use both CPUs at 100%, it could do that for six hours straight, and after that point it would get constrained by the underlying hypervisor down to whatever the quota happens to be. And that’s something that you can just Google quickly: Azure burstable instance quota. There’s a little table that shows you what that is.

The other instances I want to mention: I mentioned the D instances version three versus version two. What you will notice is version three has a couple of differences. There’s not a single core instance size. It starts at two cores. There is also a one to four ratio between CPU and RAM versus one to three and a half. But the most significant difference is that when you are looking at a vCPU on the v3 of the D family instances, you’re talking about hyperthreaded CPUs.

What does that mean? If you have a single physical core, with a processor that supports hyperthreading and a hypervisor that supports hyperthreading, then that core can be presented as two CPUs to the VM. And that’s what version three of the D instance is all about. It’s hyperthreaded cores, and as a result you’ll notice that it’s cheaper than version two even though it has more RAM. So let’s take note. D2s v3, which is two by eight, is $137. And if you look at, I’m sorry, D2 v2, which is a two by seven, it’s $170, right. So you’re paying a little bit of a premium even though you’re getting less RAM with it. Why is that? The reason is that in version two of the D series the cores are not hyperthreaded, which means these two are actual physical cores on the VM, whereas in the v3 this would actually be four cores if they were hyperthreaded. So that’s a significant differentiator between D v2 and D v3.
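To make that comparison concrete, here is a small sketch that normalizes the two monthly prices quoted in the session by vCPU and by physical core. It assumes, as described above, that two hyperthreaded vCPUs map to one physical core; the class and property names are hypothetical, and the prices are the 2019 figures read off the pricing page in the video, not current ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    ram_gib: float
    monthly_usd: float
    hyperthreaded: bool

    @property
    def physical_cores(self) -> float:
        # Assumption: a hyperthreaded physical core is presented as 2 vCPUs.
        return self.vcpus / 2 if self.hyperthreaded else float(self.vcpus)

    @property
    def usd_per_vcpu(self) -> float:
        return self.monthly_usd / self.vcpus

    @property
    def usd_per_physical_core(self) -> float:
        return self.monthly_usd / self.physical_cores


# Figures as quoted in the session.
D2S_V3 = VmSize("D2s v3", vcpus=2, ram_gib=8, monthly_usd=137, hyperthreaded=True)
D2_V2 = VmSize("D2 v2", vcpus=2, ram_gib=7, monthly_usd=170, hyperthreaded=False)

if __name__ == "__main__":
    for size in (D2S_V3, D2_V2):
        print(
            f"{size.name}: ${size.usd_per_vcpu:.2f} per vCPU, "
            f"${size.usd_per_physical_core:.2f} per physical core"
        )
```

Per vCPU the v3 size is cheaper, but per physical core the v2 size comes out ahead, which is why the v2 carries a premium on the price list.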

The other popular ones, you know, we rarely see F. We rarely see E, although what’s nice about E is that it has a lot of RAM per CPU. It’s one to eight instead of one to four. So say somebody has a very RAM-hungry workload, like a terminal server that is not CPU bound but is RAM bound, which is generally not the case. Usually CPU is the constraint on a terminal server or an RDS session. Then in this case, you can see an E2 v3 has two CPUs and 16 gigs of RAM. And it’s not much more than a D2 v3, which has the same two CPUs and eight gigs of RAM. See that? And it’s even less than a D2 v2, which is two CPUs and seven gigs of RAM. But those CPUs are different ’cause they’re actual physical cores rather than hyperthreaded cores.

Okay. Then we’ve got, again, these are just different memory ratios. So here we notice this is one to seven instead of one to three and a half, et cetera. The G series, again, we rarely see. The M series are monsters. So in the M series, there is this one. You’ll see the price of it. This is 128 cores and four terabytes of RAM at $26,000 a month. I haven’t seen anyone use that one yet, but there are some use cases for it.

Okay. Let’s keep going, because the one I really want to show you is the N series. Anything that starts with an N stands for NVIDIA. NC we don’t use. NC v2 we don’t use. NC v3 we don’t use. NV we do. NV are the only ones that can be used to deliver GPU using this GRID 2.0 technology, which allows the graphical component to be offloaded to the physical GPU inside of the VM. You can see it starts with an NV6, which has six cores at the minimum, and maxes out at NV24, which is 24 cores. You can see the ratio here. It’s kind of this weird, whatever that is, roughly one to 10-ish, a little bit less, of CPU to RAM. They are quite pricey, but what we’re really excited about is that these guys are coming out.

And this is gonna be NV version two. You can see here the RAM was doubled and the price actually went down. It’s probably half or less than that. So you’re getting more RAM for the same number of CPUs, with a faster CPU and a faster GPU, and you’re going to be paying significantly less. NC series and ND series are GPU-backed VMs, but they’re used mostly for scientific research, machine learning and the CUDA framework. They’re not really used in a virtual desktop context, and they can’t really be used in that context because Windows Server 2016 RDS doesn’t support them for GPU offload.

Okay, the next thing I wanna show you. Let’s see. Let’s open up this guy. So when you look at the VM in Azure, you know, there’s all kinds of configuration. We’ll go through it at some point. But for now what we can see is that there is a status. And the status right now is stopped, in parentheses, deallocated. There are a few terms that are used interchangeably, and that’s a bit confusing. In Azure a VM could be in one of these states. It could be started. So if I click the start button it would actually start this VM.

When the VM is started, it’s running the meter, meaning Azure is billing me for the compute that that VM is consuming. If I stop that VM from the Azure portal, it’s actually gonna put it into the stopped, deallocated state. When a VM is deallocated I’m not being billed for the compute component of this VM, but I’m still being billed for its storage. We’ll talk about that separately, but I’m not being billed for the compute. And there is an in-between state called stopped but provisioned. Okay. Stopped but provisioned can happen when you shut the OS down from within the operating system without stopping the VM from the Azure hypervisor, or the Azure control plane, right here. In that case the VM is down, meaning it’s not accessible ’cause the OS is not running, but you’re still running up the bill. So in NAP, if a situation like that ever occurs, what you’ll notice is a red triangle, and when you mouse over it, it’s going to indicate to you that the VM is stopped but provisioned, which means that the VM was shut down from the OS and it either needs to be started, so it’s useful, or it needs to be stopped, so it gets deallocated.
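The three states and their billing consequences can be summarized in a short sketch. The enum and function names below are hypothetical and chosen only for this illustration; they do not correspond to any Azure SDK type, and the rules are simply the ones stated in the session.

```python
from enum import Enum


class PowerState(Enum):
    RUNNING = "running"
    STOPPED = "stopped"                    # OS shut down from inside the VM, still provisioned
    DEALLOCATED = "stopped (deallocated)"  # stopped from the Azure portal


def compute_is_billed(state: PowerState) -> bool:
    """Compute is billed while the VM is provisioned, even if the OS is down."""
    return state in (PowerState.RUNNING, PowerState.STOPPED)


def storage_is_billed(state: PowerState) -> bool:
    """Storage is billed in every state, including deallocated."""
    return True


def needs_attention(state: PowerState) -> bool:
    """Flag the in-between state: the VM is unusable but still running up the bill."""
    return state is PowerState.STOPPED


if __name__ == "__main__":
    for state in PowerState:
        print(
            f"{state.value}: compute billed={compute_is_billed(state)}, "
            f"storage billed={storage_is_billed(state)}, "
            f"warn={needs_attention(state)}"
        )
```

Only the stopped but provisioned state is flagged, since it is the one case where you pay for compute without getting a usable VM.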

So core quotas are a protection that Microsoft has put into the environment so that when you spin up a subscription, you are limited in how much CPU you can consume from Azure. And the reason for that is kind of obvious. They’re providing this to you on credit and then you pay for it on consumption. So imagine some hacker steals a credit card, signs up for an Azure subscription, goes in, spins up an M128s, which is $26,000 a month, runs it for two weeks, and then the credit card gets canceled and Microsoft can’t collect on that amount. So to protect themselves from that, they impose something called core quotas. Core quotas are a property of the subscription. So if we go back into our subscriptions here and we go under usage and quotas, what this will tell you is what my quotas are, specifically for compute. There are other things that have quotas associated with them. Let’s just select compute in this case, because that’s the easiest one to understand. So I have an NV family quota of 24 cores and I’m currently using 18 of them. I have a region quota, meaning in one single region I cannot use more than 350, and I’m currently using 44. And then you can see the same thing on a per family basis.

So for instance, I have a limit on the number of DS family CPUs of 350 and I’m currently using two. The reason these quotas are so high is that I am using an MPN, a Microsoft Partner Network, subscription. Those may come with very high quotas. If you sign up for a free subscription, you have a quota of four CPUs, which is insufficient to deploy even the most basic NFA environment. Which is why, when somebody goes in to provision a free subscription in NFA, it validates how many cores are available and it will not allow them to proceed, because there isn’t a sufficient amount of quota. So whatever the limit is, it’s four for a free subscription, 10 for a typical pay as you go subscription, and 20 for a CSP subscription. Many times that needs to be increased, and the way that’s increased is through this increase request.
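The kind of validation described above, checking a planned deployment against both the family quota and the region quota, might look roughly like the following. This is a sketch only: the class and function names are hypothetical, the quota numbers are the ones shown in the session, and the six-core requirement in the last example is a made-up figure for illustration, not the actual NFA minimum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreQuota:
    scope: str   # e.g. a VM family or the whole region
    limit: int   # maximum cores allowed
    used: int    # cores currently consumed

    @property
    def remaining(self) -> int:
        return self.limit - self.used


def blocking_quotas(cores_needed: int, quotas: list) -> list:
    """Return the scopes whose remaining cores cannot cover the request."""
    return [q.scope for q in quotas if cores_needed > q.remaining]


# Numbers as shown on the usage and quotas page in the session.
NV_FAMILY = CoreQuota("NV family", limit=24, used=18)
REGION = CoreQuota("region total", limit=350, used=44)

if __name__ == "__main__":
    # One more NV6 (6 cores) still fits exactly.
    print(blocking_quotas(6, [NV_FAMILY, REGION]))
    # An NV12 (12 cores) would exceed the NV family quota.
    print(blocking_quotas(12, [NV_FAMILY, REGION]))
    # Free subscription with a four-core limit; 6 cores is a hypothetical requirement.
    print(blocking_quotas(6, [CoreQuota("free subscription", limit=4, used=0)]))
```

A deployment has to pass every applicable quota, so an empty list means it can proceed and any returned scope is the one to include in a quota increase request.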
