
Azure IT Fundamentals: Azure Storage

April 29, 2019

Joseph Landes
In this session, you will learn more about one of the most fundamental concepts in Azure, Azure Storage. We’ll dive deeper into topics like Azure Disks, Managed Disks, and Disk Performance. We’ll go through an overview of storage pricing, including pricing Managed Disks, and we’ll discuss the practical differences between Standard and Premium Storage. Understanding Azure Storage is a fundamental building block needed to start building a successful cloud practice in Microsoft Azure. Enjoy the session.

Vadim Vladimirskiy
So, Azure Storage. One of the types of storage that you have in Azure is called Azure Disks. Effectively, those are VHD files. If you think about the Hyper-V environment, which obviously is the parallel to the vSphere or ESX environment, in a Hyper-V environment we have VHD or VHDX files that get attached to the individual VMs as disks. So in Azure, a VHD file can be attached as a disk to a particular VM.

Vadim Vladimirskiy
So as an example, just to show you what that looks like, let’s go to a particular virtual machine. As you recall, we talked about how a virtual machine consumes Azure [inaudible 00:01:39] per minute, per second, but it is before that [inaudible 00:01:43] what instance or what family of VM you’re using. In order to have a VM, you must have an operating system disk, an OS disk. So let’s click on any of these VMs. Let’s start with DCL1 as we’ve done before.

Vadim Vladimirskiy
One of the options here is called Disks, and if you click on Disks, you will see that this VM has two data disks attached to it. LUN 0 is a data disk, LUN 1 is a data disk, and then there is also an OS disk. So in total, there are actually three disks attached to this particular VM. If you go under Services and search for Disks, you’ll see that there is actually a kind of management interface for Disks that I’ve just added to my Favorites right there. This will give you a list of all the disks available in a particular subscription, resource group, account, what have you. You can limit this by a certain resource group.

Vadim Vladimirskiy
For example, let’s look at NFA 5001. You can see that I have various disks inside of this account, and you can also see what machine, or what VM to be more specific, they’re attached to. So for instance, the three disks that we looked at, one of them being an OS disk and two of them being data disks, are attached to DCL1. You can also see the sizes of the disks, and you’ll notice that these sizes are very discrete. You don’t really have in-between sizes. So you have four gig, eight gig, 16, 32, 64, et cetera, up to, I believe, currently maybe eight terabytes, or maybe even bigger now. But anyway, they’re discretely sized disks.

Vadim Vladimirskiy
Okay, now. There are two types of disks in Azure. The older, sort of legacy type of disks are called Unmanaged Disks, and I’m not really going to talk about those in much detail. I’ll mention them again when we talk about storage accounts. But the second type, the more modern one and the type of disk that we generally utilize when deploying and using Nerdio, is called the Managed Disk. And everything I’m showing you now is the result of this disk being a managed disk, meaning it appears as a resource object inside of Azure, whereas unmanaged disks appear kind of differently. There are different things you can do with the two types of objects.

Vadim Vladimirskiy
So let’s click on, let’s say, the OS disk of DCL1. Okay. So what do we see here? We see the resource group that it’s in, we see the disk size, and what type it is. In this case, it’s a Standard SSD, and there are other types that we’ll talk about. It’s currently attached to DCL1, and it is stored in the South Central US region, which happens to be where DCL1 is stored. It is not possible to attach a disk that exists in another region to a VM in the current region. So those need to be in the same place. In theory, you can create a disk in a different region, but you will not be able to use that as the OS or data disk of a VM that’s not in [inaudible 00:05:11].

Vadim Vladimirskiy
Okay. You can do a few things with this disk. You can, for instance, export it, and what exporting does is make it downloadable. So you can actually point at a URL and download it as a VHD. You can then attach it to a Hyper-V VM locally, make your changes, and then put it back, import it as a disk again, and be able to use that disk inside of Azure. And what this is telling me here is that you cannot export a disk, meaning I cannot get a URL to point at and download it, unless the VM is de-allocated. So if I stop DCL1, then this disk will become accessible for me to download, but while the VM is running, it is unavailable to me.

Vadim Vladimirskiy
As far as the configuration goes, there’s really not much to configure. But what’s important to keep in mind is that this stuff is grayed out right now because the disk is currently attached to an allocated, running VM. I would be able to make these changes if the VM was not running, or if the disk wasn’t attached to a VM. And you can make these changes on the fly, so I could basically use the dropdown, change it from a Standard SSD to a Premium SSD, and right away power on my VM. When I power on my VM, it is going to have a Premium SSD disk. However, if you think of it on the backend, it obviously didn’t just magically change from one storage type to another within a second. It actually goes through kind of a background migration process, which you don’t really notice, and you can’t really track it or see how far along it is.

Vadim Vladimirskiy
Now with each disk account type, so Standard SSD, Premium SSD, Standard HDD, et cetera, you have different performance characteristics and estimates, and there are two metrics that are tracked. We’ll be looking at the specifics in a minute. But for Standard SSD, you get up to 60 megabytes per second of [inaudible 00:07:24], so if you’re doing sustained reads and writes, you can expect performance somewhere in this range. And if you’re doing very small random access operations, then you have about 500 IOPS; that is your limit. And the bigger the disk, the more you get with certain storage types. With Standard storage, you don’t really get much more than what’s here, 500 and 60. But with Premium, you actually get more as you go up. So as the disk gets bigger and more expensive, they give you higher IOPS and throughput limits on those disks.
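To make the interaction between the two limits concrete, here is a minimal sketch. It assumes a simplified model in which the achievable throughput is the smaller of the throughput cap and the IOPS cap multiplied by the I/O size. The 500 IOPS and 60 MB/s defaults are the Standard SSD figures quoted in the session; the function name and the model itself are illustrative assumptions, not an Azure API.

```python
def effective_throughput_mb_s(io_size_kib, iops_limit=500, throughput_limit_mb_s=60):
    """Estimate achievable throughput in MB/s under a simplified two-cap model."""
    # Throughput implied by the IOPS cap at this I/O size (1 MB treated as 1024 KiB).
    iops_bound_mb_s = iops_limit * io_size_kib / 1024
    # Whichever cap is reached first determines what the disk can deliver.
    return min(iops_bound_mb_s, throughput_limit_mb_s)


small_io = effective_throughput_mb_s(4)     # small random I/O: limited by IOPS
large_io = effective_throughput_mb_s(1024)  # large sequential I/O: limited by throughput
print(small_io, large_io)
```

Under this model, small random I/O never gets near the throughput cap, which is why both numbers matter when sizing a disk.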

Vadim Vladimirskiy
Let’s jump onto this page. That is Microsoft Azure Storage Overview and Pricing. There are probably about, what, one, two, three, four, five, six, seven types of storage, but the vast majority aren’t relevant to us at all. So let’s focus on Blobs, Disks and Files. Okay? For those of you who are familiar with Amazon’s S3, which is the first cloud storage service that came out, I think, in 2005 or 2006, it’s basically what’s called Object Storage, where you can take discrete objects, and those may be files, and you can put them into a storage repository via some sort of a REST API or some sort of an interface.

Vadim Vladimirskiy
Okay? So that’s what Blob Storage is. It is the cheapest, other than the Data Lake, which is long-term storage, but let’s ignore that for now. So it’s 0.2 of a cent per gigabyte per month to store things as Blobs. Blobs aren’t really useful in an IT context because you can’t really store anything in particular in Blobs without some sort of an interface into Blobs. For example, Dropbox is an interface for using Amazon’s S3 Object Storage. Without Dropbox, you can’t really leverage S3 directly. I mean, there are some tools that can let you drag and drop things in there, but it’s not very common and not very useful in the IT context. It’s really more for applications.

Vadim Vladimirskiy
So if you think about, let’s say, Trello, or Asana, or any of those tools that allow you to attach objects to them, where those objects actually get stored is in Blob Storage on the backend, because that’s the cheapest and it gives the provider, in this case Azure or AWS, the most flexibility in how to treat those files. They don’t have to go into contiguous space, so it’s cheaper to deliver, which is why it’s the cheapest type of storage available.

Vadim Vladimirskiy
The second one we talked about is Managed Disks, right? So it’s kind of the next level of abstraction. The Managed Disk is something you use to attach to VMs, and attach to VMs only. So whereas Block Blobs can be accessed directly through some sort of an API, a Disk really can’t be accessed directly. A Disk is basically a file in a format that a virtual machine can connect to as a disk, and that is what’s typically used in an infrastructure-as-a-service deployment in Azure, with OS and Data Disks being attached to the VM.

Vadim Vladimirskiy
And then the third type of storage that’s relevant to us is Files. The easiest way to think about Files is as the ability to create a file share, an SMB/CIFS file share, which means something that looks like a regular file share on a Windows server, except you do not need a Windows server serving up that share. You can actually have Azure serve up that share, and you’ll be able to use a net use or mapped network drive command right from within Windows. Assuming the right storage access is available, you’ll be able to map a drive to it, or use it as a UNC path, and it’s accessible as a file share, but without the need for a file server. Right?
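As a small illustration of the UNC path and net use idea, the sketch below builds the path for a file share using the standard public-cloud endpoint suffix file.core.windows.net. The account name, share name, and helper names are hypothetical, and credentials (normally supplied from the storage account’s access key) are deliberately left out.

```python
def azure_files_unc_path(storage_account, share_name):
    """Build the UNC path for an Azure file share (public cloud endpoint assumed)."""
    host = f"{storage_account}.file.core.windows.net"
    # "\\\\" is two literal backslashes; "\\" is one.
    return "\\\\" + host + "\\" + share_name


def net_use_command(drive_letter, storage_account, share_name):
    """Return a Windows net use command that maps the share to a drive letter."""
    unc = azure_files_unc_path(storage_account, share_name)
    return f"net use {drive_letter}: {unc}"


# Hypothetical names, for illustration only.
print(net_use_command("Z", "examplestorage01", "shared"))
```

The point of the example is that the path refers to the storage account itself; there is no file server name anywhere in it.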

Vadim Vladimirskiy
So there are two ways, let’s say, to present a file share to a user. One would be to have a file server, attach a Managed Disk to it, set up a folder on that Managed Disk, share that folder, and present it to an end user. The other way to do it would be to create a file share right inside of Azure, without the need for a virtual machine or a Managed Disk, and present that to your users.

Vadim Vladimirskiy
So these are the different storage types, at a high level. I don’t even know if type is the right term in this particular case, but you get the idea, right? It’s the types of objects that you can store. Then for each of these objects, or storage types, you have different levels of data redundancy, okay? The most basic one is called Locally Redundant Storage, or LRS for short, so anywhere inside of a system you’ll see it referred to as LRS. And LRS is designed with [inaudible 00:12:51] of durability. Again, this is not availability, this is durability, which means that within a single year, you have this much of a chance that you will not lose your data, and you have one minus that chance that you will lose it. Okay? So that’s just an interpretation of durability in easy terms.
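The "one minus" interpretation of durability can be worked through with a short sketch. The helper name is hypothetical, and the three-nines input is purely illustrative; it is not the LRS figure, which is not audible in the recording.

```python
from fractions import Fraction


def annual_loss_probability(nines):
    """Chance of losing data within one year, given durability expressed as N nines."""
    # N nines of durability means a durability of 1 - 10**-N (e.g. 3 nines = 99.9%).
    durability = 1 - Fraction(1, 10 ** nines)
    # The chance of loss is one minus the durability.
    return 1 - durability


example = annual_loss_probability(3)  # illustrative value only, not an Azure figure
print(example)
```

Each additional nine divides the yearly chance of loss by ten, which is why durability figures are usually quoted as a count of nines.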

Vadim Vladimirskiy
So Locally Redundant Storage, from what I understand, is multiple copies of the data, I think three copies of each bit of your data, stored in a single data center. So it’s a single physical location that is well connected, and each copy can be accessed at about the same speed from the same virtual machine, for instance. That’s what’s known as Locally Redundant Storage.

Vadim Vladimirskiy
Then there is Zone Redundant and Geographically Redundant. Geographically Redundant is something that we use pretty commonly; the short name for it is obviously GRS. You get sixteen 9’s of durability over a given year, and you are storing, or rather they are storing, six copies: three in the data center where you’ve created the storage account, and three more in another data center in a different region that is paired up. Basically, in Azure, different regions are paired up with each other. So for any particular region where you store data, there is a pre-defined pair where the Geographically Redundant copies of the data are going to be stored.

Vadim Vladimirskiy
Now you don’t have access to those six copies, meaning it’s something that Microsoft handles for you in the background. You couldn’t say, “Well, I want to point my server or my users at a particular version, either the local version or the Geographically Redundant version.” You don’t have that flexibility.

Vadim Vladimirskiy
And then the final type of storage I want to talk about, or rather redundancy option, is called Read-access Geographically Redundant Storage. It’s the same concept as GRS, except you actually get to decide which location you want to read from, right? So it allows you read access from the second region used for GRS. It’s uncommon; we don’t typically use it. Of the two most commonly used ones, I would say 80% of the time, or maybe even more than 80% of the time, we use LRS. And for certain use cases in NFA, for example enabling a backup, where you get a choice whether you want to set up LRS or GRS, we may use GRS. Okay?

Vadim Vladimirskiy
So these are the redundancy options. Now it’s important to know that not every redundancy option applies to every storage type. Case in point. A Managed Disk can only be LRS, you cannot have a Managed Disk that’s GRS for example. Blobs and Files on the other hand I believe can be either LRS or GRS.
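The redundancy options and their applicability can be summarized as a small lookup table. This is a sketch of what the session describes at the time of recording, not a statement of current Azure offerings; the names REDUNDANCY, SUPPORTED, and is_supported are hypothetical, and the Blob and File entries reflect the speaker’s "I believe" rather than a confirmed list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    copies: int               # total copies of the data
    regions: int              # number of regions holding copies
    secondary_readable: bool  # whether you can read from the paired region


# Figures as described in the session.
REDUNDANCY = {
    "LRS": Redundancy(copies=3, regions=1, secondary_readable=False),
    "GRS": Redundancy(copies=6, regions=2, secondary_readable=False),
    "RA-GRS": Redundancy(copies=6, regions=2, secondary_readable=True),
}

# Which options apply to which storage type, per the session (assumption).
SUPPORTED = {
    "managed_disk": {"LRS"},
    "blob": {"LRS", "GRS"},
    "file": {"LRS", "GRS"},
}


def is_supported(storage_type, redundancy):
    """Return True if the session describes this combination as available."""
    return redundancy in SUPPORTED.get(storage_type, set())


print(is_supported("managed_disk", "GRS"), is_supported("blob", "GRS"))
```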

Vadim Vladimirskiy
All right, so then let’s keep going and look at how this is actually implemented. Managed Disks are the easiest things to understand conceptually, and I showed you what it looks like in the UI. There is a Disks Management module, and all Managed Disks are listed here as what they call First Class Citizens in Azure. They’re objects that you can reference and work with directly. Okay? Now if you want to deal with any other storage type, what you have to do is create a storage account first, and the storage account is basically a repository of various storage objects.

Vadim Vladimirskiy
So let’s go through that process. We’re going to click create storage account, and we get to select which subscription and resource group to put it in. We select what name we want to give it. It’s important that this name, I believe, is geographically unique; that’s why it requires it to be so long. One second, in lower case. Sorry about that. What this little thing is doing is verifying uniqueness, so let’s try test. I bet that’s not going to be unique, right? The reason it has to be unique is that a storage account can be exported or presented as a URL, and if it’s presented as a URL, that means it has to be a unique URL for each and every storage account. That’s why this has to be unique. Okay, so let’s do that. Okay, that is not unique. Okay, let’s try 30123. Okay, and again the uniqueness has to be within a particular region, because you’ll see that the URLs that get exported have the name of the region in them, so that’s why that’s got to be like that.
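The lower-case requirement that trips up the demo can be checked locally before submitting the form. The sketch below assumes Azure’s documented format rule for storage account names (3 to 24 characters, lowercase letters and digits only); the function name is hypothetical. It only checks the format. Whether a name is already taken can only be determined by Azure itself, which is what the portal’s inline check does.

```python
import re

# Assumed format rule: 3 to 24 characters, lowercase letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name):
    """Check the name format only; uniqueness still has to be verified by Azure."""
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ["test", "Test", "ab", "test30123"]:
    print(candidate, is_valid_storage_account_name(candidate))
```

Note that a name like test passes the format check and still fails in the portal, because the second check, uniqueness, is a separate step.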

Vadim Vladimirskiy
Here you select your location, so let’s say we’re in South Central; let’s select South Central US. The next thing you get to select is what kind of storage performance you’re going to have, either Standard or Premium. Up until maybe two months ago, or even a month ago, Standard meant spinning media and Premium meant SSD. They’ve now added a Standard type of SSD as well. So generally, Standard is slower, and Premium is faster. Okay, so if we do Standard, you’ll see that we have account types; there is even Blob. You can do storage V1 or V2, and there are some differences between V1 and V2, but we won’t get into those. Let’s go with V2 because that’s the latest. And then you select what they call replication, or that geo-redundancy, and you can select local, geo or read-access geo. And then you also get to select whether it’s cool or hot. Again, hot means it’s easily accessible and faster; cool means they kind of stage it and put it away somewhere in cold storage, which then allows you to pay less, but it takes longer to retrieve.

Vadim Vladimirskiy
So let’s go through the next step. Okay, do we want to require secure transfer? Let’s keep that as enabled. Do you want to allow access from specific networks or all networks? Let’s say all networks. And then, do we want to use Data Lake? Let’s say no. You can add tags; these are just ways of tracking and labeling things. And then the final thing is you go ahead and you create this account. I’m not going to create it because it will take a minute, and I don’t want to waste time. But I’m going to show you the two accounts I already have. These get provisioned by default when Nerdio is built. One of them is a Premium account, and one of them is a Standard account.

Vadim Vladimirskiy
So if I click on the Standard account, you’ll see a bunch of options that I have here, and there’s something called access keys which means I can actually export these keys and let somebody access my storage account via these keys. There is configuration where you’ll see the performance, you’ll see the secure transfer, the replication, et cetera, encryption, et cetera.

Vadim Vladimirskiy
Now what I want to show you here is that there is a Files list, and I can go ahead and click file share, and it’ll ask me for a name and a quota, meaning how much I want to allow to be stored in here. And this will actually create a file share. Now until I actually store data in this file share, I’m not paying for consumption, because pricing of Files is per gigabyte per month. So I’m really paying for what I use.

Vadim Vladimirskiy
What you’ll notice though is I cannot create a disk directly in here because a disk is always associated with the VM, and it’s dealt with a little bit differently.

Vadim Vladimirskiy
Okay now, how is this stuff priced and how do you pay for it? Let’s look at Managed Disks first. Managed Disks come, as I mentioned, in very discrete sizes, from 32 gigabytes in Premium all the way to eight terabytes in Premium. Actually, I think it goes bigger now; there is now 32 terabytes you can get. And you can see that the cost of each disk is roughly proportional to the size, so you see it’s roughly doubling every time. It gets a little cheaper as you get bigger, but not much. And then the throughput, you’ll notice, is also increasing, and finally it tops out at 750 megabytes of transfer, which is very fast obviously, and 20,000 IOPS per disk. Given the size of this disk, you obviously need all that room.

Vadim Vladimirskiy
So that was Premium SSD. That’s storage type number one. They’re called P Disks, and you can see that there’s P4, P6, P10, et cetera. Then there is something called Standard SSD. Those have the letter E in front of them, and they go from E10, which is 10 gigabytes, probably all the way up to 32. You can see that they’re cheaper; they’re about half the price. And they are slower: they have about 10% of the IOPS at the top end. For throughput, I forget what the other one was, but I think it was much more than that. It was 750 and here it’s 500, so not much slower in throughput.

Vadim Vladimirskiy
Okay, so that’s Managed Disk storage type number two. So there’s Premium, the fastest and most expensive, then there is Standard SSD, and then there is Standard HDD. Those have the letter S in front of them, with the same sizing, but you’ll notice the performance is pretty flat up until four terabytes, and then it starts scaling up a little bit. And again, you get similar performance to Standard SSD. The difference between Standard HDD and Standard SSD is that SSDs have a much smaller variance in performance, so you’ll almost always get something close to those maximums, whereas with HDD, because it’s based on spinning media, there is a lot of variability. Sometimes you get lots of IOPS, sometimes you won’t; that’s why it always says “up to,” as you’ll notice. And with HDD, it’s very true: performance is inconsistent.

Vadim Vladimirskiy
And then the fourth type of storage, which isn’t in general availability yet, it’s in preview, is called Ultra SSD. You’ll notice that the throughput here is quite impressive: 160,000 IOPS, and up to 2,000 megabytes per second. So that’s pretty good. I don’t know what the pricing is; I’m sure it’s going to be very expensive. So pricing for Managed Disks is determined by the disk size. If, let’s say, you currently have 512 gigabytes and you go into the Azure portal and set it to 513, you’ve now just automatically jumped into a one terabyte disk, and you’ll get billed for the entire terabyte regardless of how much you’re using. And then you will have 511 gigabytes unallocated inside of your operating system. So it’s important to remember these are discrete.
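The 512 versus 513 example can be sketched as a round-up to the next tier. The tier list below is an assumption built from the sizes named in the session (doubling from 4 gigabytes up to 32 terabytes); the function and variable names are hypothetical.

```python
# Assumed tier list: sizes double from 4 GiB up to 32 TiB (32768 GiB).
DISK_TIERS_GIB = [4 * 2 ** i for i in range(14)]


def billed_tier_gib(requested_gib):
    """Return the smallest tier that fits the requested size; that tier is what gets billed."""
    for tier in DISK_TIERS_GIB:
        if requested_gib <= tier:
            return tier
    raise ValueError(f"{requested_gib} GiB exceeds the largest assumed tier")


requested = 513
tier = billed_tier_gib(requested)
unallocated = tier - requested  # space billed for but not yet allocated in the OS
print(tier, unallocated)
```

Asking for exactly 512 stays in the 512 tier, so a single extra gigabyte is what doubles the bill.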

Vadim Vladimirskiy
But there is also something else you pay for, which is storage operations. For every 10,000 or 100,000 operations, which could be a read, a list, a delete, what have you, you pay a few pennies. Obviously, if storage is very active, then that adds up. If storage is pretty dormant, it doesn’t. And it’s important to remember that the only storage where you pay for operations is Standard Storage. So with either Standard HDD or Standard SSD, you pay for operations, whereas with Premium or Ultra, you do not pay for operations.

Vadim Vladimirskiy
Okay? Now, so Managed Disks again come in discrete sizes that you pay for based on those two metrics. Files and Blobs are billed on a data usage type of scenario. So you either pay six pennies per gigabyte in LRS, seven and a half for ZRS, or 10 for GRS. You’re not paying for a preset amount; as you keep adding files, you’re paying a per gigabyte increment. And then you’re also paying for operations, and here are your costs. For LRS, you’re paying one point five cents per 10,000 operations, then one point eight and three cents, and you can also see that there are different types of operations that have different prices. And that is it as far as Files go.
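A worked example of the consumption pricing, using the per-gigabyte rates and the LRS operations rate quoted in the session. These are the figures as spoken at the time of recording, not current Azure prices; the sketch also assumes a single flat operations rate, whereas the session notes that different operation types are priced differently. All names are hypothetical.

```python
from decimal import Decimal

# Per-gigabyte monthly rates in dollars, as quoted in the session.
PRICE_PER_GB_MONTH = {
    "LRS": Decimal("0.06"),
    "ZRS": Decimal("0.075"),
    "GRS": Decimal("0.10"),
}

# LRS operations rate quoted in the session: 1.5 cents per 10,000 operations.
LRS_PRICE_PER_10K_OPS = Decimal("0.015")


def monthly_files_cost(stored_gb, operations, redundancy="LRS",
                       price_per_10k_ops=LRS_PRICE_PER_10K_OPS):
    """Estimate a month's bill: stored gigabytes plus operations (flat rate assumed)."""
    storage_cost = Decimal(stored_gb) * PRICE_PER_GB_MONTH[redundancy]
    operations_cost = Decimal(operations) / Decimal(10000) * price_per_10k_ops
    return storage_cost + operations_cost


# 100 GB stored in LRS with 50,000 operations in the month.
cost = monthly_files_cost(100, 50000)
print(cost)
```

An empty share costs nothing under this model, which matches the earlier point that you only pay once data is actually stored.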

Vadim Vladimirskiy
Okay, so again, you publish a file share, you can place files into it, and then you pay for consumption, and for the operations that you consume.