
Azure IT Fundamentals: Azure Files and File Sync

May 03, 2019 | Videos

Joseph Landes
In this session, we will continue the discussion on Azure storage by diving deeper into Azure Files. We'll talk about encryption, BitLocker, Azure Key Vault, and Azure File Sync. We're going to use the whiteboard a bunch in this discussion, so get ready for what will be an exciting talk about another of the many core topics needed to build a successful cloud practice in Microsoft Azure. Enjoy the session.

Vadim Vladimirskiy
In Azure, there's something called Azure Storage Service Encryption, or SSE, an abbreviation you'll notice in certain places. This is a low-level encryption of the data at rest on storage accounts and disks that is automatic and doesn't really require any configuration. What that means is, any time you write data to a storage service, any type of storage (blobs, files, disks, any of the ones we mentioned last time), your data is getting encrypted. So if, in theory, somebody were to steal a physical hard drive that has some Azure data stored on it and plug it into another computer via USB, the data on the drive would be in an encrypted state, which is what encryption of data at rest is all about.

Vadim Vladimirskiy
The important thing to realize is that this data is encrypted with keys that are managed by Microsoft, and this is not really configurable, in the sense that you can't control which keys SSE uses to encrypt the data. You cannot turn the encryption off. And according to Microsoft, Storage Service Encryption does not affect performance.
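If you want to see this from the management plane, a minimal sketch (assuming the Az PowerShell module; the resource group "rg-prod" and storage account "nerdiofiles" are hypothetical names for illustration) is to read the encryption properties off a storage account:

```powershell
# Hypothetical names: resource group "rg-prod", storage account "nerdiofiles"
$acct = Get-AzStorageAccount -ResourceGroupName "rg-prod" -Name "nerdiofiles"

# SSE is always on; this just shows which services are encrypted and where the keys come from
$acct.Encryption.Services    # per-service (blob/file) encryption status
$acct.Encryption.KeySource   # "Microsoft.Storage" means Microsoft-managed keys
```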

Vadim Vladimirskiy
It's something that was rolled out not that long ago, maybe this year or maybe last year. But it's not a trivial thing, in the sense that it's important for lots of data compliance reasons to be able to say the data is encrypted at rest. Azure Storage Service Encryption does that automatically, so any deployments with [inaudible 00:02:22] Azure, or into Azure in general, benefit from encryption of data at rest.

Vadim Vladimirskiy
Sometimes we get asked if there is a way to encrypt data within the Azure environment using customer-provided keys. For those of you who may not be familiar with how data encryption works, there is a public and a private key, and with customer-provided keys you control them: the customer, or the partner on behalf of the customer, controls the key and uses it to encrypt the data as you place it and decrypt it as you retrieve it. Because Microsoft manages the keys for SSE, they in theory have the keys to that data and could decrypt it. So it's not really your own key, and the data is exposed, so to speak, to Microsoft. Usually, again, not a big deal. But it's important to realize that if customers ask for encryption using their own keys, what they can do is use another service that sounds very similar. The distinction is that it's called Azure Disk Encryption, as opposed to Azure Storage Service Encryption.

Vadim Vladimirskiy
Disk encryption is designed specifically for IaaS VMs. So whenever you run a VM, if you recall we talked about this, you must have an OS disk. Typically that's a managed disk. [inaudible 00:03:58] talk about unmanaged disks, which I think are not as relevant anymore, so we're not going to talk about them. But managed disks can be encrypted, and those get encrypted with BitLocker at the OS level using encryption keys that could be customer-provided or could be custom keys, which are then stored within this thing called Azure Key Vault. Azure Key Vault is a service in Azure that's a secure repository for various types of keys, and one of the use cases for Azure Key Vault is to store the encryption keys, which can then be used to encrypt a disk attached to a VM, which is running on top of Azure storage, which is already encrypted.

Vadim Vladimirskiy
So what are you getting as a benefit of doing disk encryption? You're getting the use of your own keys that Microsoft isn't aware of, because you're providing them and storing them in your own key vault, which, in theory, is not accessible to anybody but the administrator who sets up the key vault and the keys that are within [inaudible 00:05:10].

Vadim Vladimirskiy
The thing to keep in mind about disk encryption is, as I mentioned, it's done using BitLocker, which is a feature of Windows. So it's something that runs within the operating system and encrypts the disks at the OS level. And it's something that can be triggered with PowerShell or from within the Azure administration portal. You can do it from the command line, from outside of the VM; you don't have to be inside the VM to do it. But it is a feature of Windows. It's the same BitLocker that's used if you have a laptop and you want to encrypt the data on it so that, in case it gets lost, the data on it is unreadable. It's the same type of functionality, a standard feature of Windows.
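As a rough sketch of what that looks like with the Az PowerShell module (all names here, such as "rg-prod", "kv-disk-keys", and "vm-fs01", are hypothetical, and the exact parameters vary by scenario):

```powershell
# Create a Key Vault that is allowed to hold disk-encryption material
$kv = New-AzKeyVault -Name "kv-disk-keys" -ResourceGroupName "rg-prod" `
        -Location "eastus" -EnabledForDiskEncryption

# Turn on Azure Disk Encryption (BitLocker inside the guest OS) for the VM's disks,
# storing the key material in the vault created above
Set-AzVMDiskEncryptionExtension -ResourceGroupName "rg-prod" -VMName "vm-fs01" `
    -DiskEncryptionKeyVaultUrl $kv.VaultUri -DiskEncryptionKeyVaultId $kv.ResourceId

# Confirm the OS and data disks now report as encrypted
Get-AzVMDiskEncryptionStatus -ResourceGroupName "rg-prod" -VMName "vm-fs01"
```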

Vadim Vladimirskiy
One thing to show you in the Azure UI: if you go into a VM and look at its disks, you'll see a column for encryption, and it will say it's not enabled. The reason it's not enabled is not that the data at rest isn't encrypted, because it is encrypted with Storage Service Encryption. What's not enabled is Azure Disk Encryption. So these disks are not encrypted with a custom key; they're encrypted only at the storage level.

Vadim Vladimirskiy
Let's look at Azure Files. So we talked about Azure Files: it's the ability to configure an SMB file share inside of Azure. So let's draw that out. Let's say we have Azure Files and we have a share called Public inside. So we have a share called Public that is being shared out of Azure Files, which is running inside of Azure. And that share can be connected to either through, let's call this FS01, or maybe even directly by desktops. So these Windows machines can map the drive, let's say as P:, directly to the share. It's going to be presented as something like \\<some URL>\<some ID>\Public.
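Creating that kind of share is only a couple of commands; here is a minimal sketch with the Az PowerShell module (the storage account name "nerdiofiles" and resource group "rg-prod" are hypothetical):

```powershell
# A standard (HDD-backed) general-purpose storage account to hold the share
$acct = New-AzStorageAccount -ResourceGroupName "rg-prod" -Name "nerdiofiles" `
          -Location "eastus" -SkuName "Standard_LRS" -Kind "StorageV2"

# The SMB share itself, named "public"
New-AzStorageShare -Name "public" -Context $acct.Context
```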

Vadim Vladimirskiy
So you can simply run the command and, as long as you have network-level access to this share, you're able to map it as a drive. The important thing to keep in mind, as far as limitations go, is that Azure Files is currently limited to standard storage only. And as you recall, standard storage is based on spinning media, so HDDs, not SSDs. There is currently a limited preview to allow Azure Files to be in premium storage, which is going to be much faster: higher throughput [inaudible 00:08:19], more IOPS, etc. But being in standard storage means that if you are, for instance, mapping to Azure Files from a VM that's running inside of Azure, let's say in the same region, so physically close to each other, you have a VM called FS01 and an Azure Files share called Public, and if you map from one to the other, the networking is going to be relatively fast because they're in the same region. But the disk performance may not be that fast, because it's standard storage.
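The "command" referred to here is essentially just a drive mapping; a hedged sketch, again using the hypothetical "nerdiofiles" account, where the storage account key stands in for credentials since there is no AD integration:

```powershell
# Authenticate to the share with the storage account name and key (no AD identity involved)
$storageKey = "<storage account key>"   # e.g. from Get-AzStorageAccountKey
$cred = New-Object -TypeName System.Management.Automation.PSCredential `
          -ArgumentList "AZURE\nerdiofiles", (ConvertTo-SecureString $storageKey -AsPlainText -Force)

# Map the share as P: over SMB (outbound port 445 must be reachable)
New-PSDrive -Name P -PSProvider FileSystem `
    -Root "\\nerdiofiles.file.core.windows.net\public" -Credential $cred -Persist
```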

Vadim Vladimirskiy
And if you imagine having to map to an Azure Files share outside of the same Azure region, so either from another Azure region or from an on-premises system, let's say from a laptop that you're traveling with, then you have two constraints. You have standard storage, which gives you fairly slow disk performance, and you now also have some network latency, which wouldn't be different from mapping a drive to a network share on a file server that's somewhere across a [inaudible 00:09:25] connection.

Vadim Vladimirskiy
So, same types of limitations. The other limitation of Azure Files is the fact that, at the current time, it is not integrated with Active Directory. So what does it mean that it's not integrated with Active Directory? The typical way you use file shares is you present a share that's open to everyone with full control, and then you use NTFS access control lists on a per-group or maybe per-user basis to segment who has access to what.

Vadim Vladimirskiy
As you can imagine, that type of access list and NTFS control requires being aware of the account IDs or the group IDs within an Active Directory. And Azure Files currently does not support Active Directory integration, which means that the only way to limit access is on a per-share basis, not per folder or file within a share, and also to do so based on the source connection and not on the individual user that's logging in to that particular file share.

Vadim Vladimirskiy
So those are the limitations. There is now, I think, maybe still in limited preview, maybe in public preview, or maybe even GA, a feature that integrates Azure Files with something called Azure Active Directory Domain Services, which is kind of a way of being able to use your existing Active Directory and have Azure Files and other similar services be familiar with, or aware of, the security IDs of the various accounts and group identifiers, so you can limit access based on NTFS permissions. Okay? So that's Azure Files at a high level. Not super useful in its current state for a typical deployment. Why? Because imagine we're using something like Public, or even a Users share that we redirect Desktop and Documents folders to, or a UPD, a User Profile Disk. Imagine doing that into standard storage. The performance wouldn't be good enough, and the user experience on the desktop would suffer as a result. So imagine all of the desktop items are redirected to an Azure Files share: even if it's in the same Azure region, you have fairly slow disk performance that you can't really increase.

Vadim Vladimirskiy
So if users start working with large files, opening large files, saving large files, they may run into those performance limitations. With Azure Files becoming compatible with premium storage, that's going to go away. It's going to be a little more expensive, but it's going to eliminate that limitation. The other significant limitation is, again using Desktop and Documents folders as the example: when those get redirected, they're always limited to the user who needs to have access. Public shares are generally open to the domain users, but department shares and user shares are usually limited on a per-user basis, and that is not something that's currently possible in Azure Files without using this Azure AD Domain Services, which is kind of a separate, add-on service.

Vadim Vladimirskiy
So, how do you make use of Azure Files? The way you can make use of Azure Files is with something called Azure File Sync. It sounds similar to Azure Files, but it's actually a feature that runs on top of Azure Files and does the following. Imagine you have an Azure Files share, call it Public. In front of it, imagine you have an Azure File Sync service, which is an Azure service, sitting in front of Azure Files. And then what you do is you can create sync servers; let's call one FS01, and let's say there is another one that's maybe on-premises, call it FILE01. So let's say this one is on-premises and this one is in Azure. These servers are called sync servers, and they can now communicate with Azure File Sync, and they can synchronize the contents of this share with an equivalent share inside of these VMs.
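Wiring that up on the Azure side looks roughly like this (a sketch assuming the Az.StorageSync PowerShell module; every name here, such as "nerdio-sync" and "public-group", is hypothetical, and exact parameter names may differ slightly between module versions):

```powershell
# The Storage Sync Service is the top-level Azure File Sync resource
$syncService = New-AzStorageSyncService -ResourceGroupName "rg-prod" `
                 -Name "nerdio-sync" -Location "eastus"

# A sync group ties one Azure Files share to one or more servers
$syncGroup = New-AzStorageSyncGroup -ResourceGroupName "rg-prod" `
               -StorageSyncServiceName "nerdio-sync" -Name "public-group"

# The cloud endpoint points the sync group at the "public" share in the storage account
$acct = Get-AzStorageAccount -ResourceGroupName "rg-prod" -Name "nerdiofiles"
New-AzStorageSyncCloudEndpoint -ResourceGroupName "rg-prod" `
    -StorageSyncServiceName "nerdio-sync" -SyncGroupName "public-group" `
    -Name "public-cloud" -StorageAccountResourceId $acct.Id -AzureFileShareName "public"
```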

Vadim Vladimirskiy
Now, the great thing about this is that the share inside of these VMs is what's presented to the users. So let's say we now have a WF01 here. This WF01 is a client of this file server. So this file server can present Public as a regular folder and a regular share inside of its own disk and operating system, with all of the regular NTFS capabilities, and it will stay consistent as it syncs to the other servers. So it's a way of creating a cache of the Azure Files data locally, on a file server that's close to the clients that are accessing that data. That makes it faster and also allows for NTFS permissions, because it's Active Directory aware; this is just regular file services.

Vadim Vladimirskiy
So, for instance, if you have two or even three servers, let's say you have FS02 in a completely different Azure region also connected to this file share, then any time a change is made here, that change is going to get replicated here and here through this common Azure File Sync service, which is accessing the data that's stored in the Azure Files share. Now, what you could also do is map to and make a change directly on the Azure Files share. So let's say you make a change up here. Within 24 hours, because it's got to detect that the change has been made, there's a change detection job that will run, and that change will appear on every sync server that is currently syncing to that file share through Azure File Sync.
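If you don't want to wait for that periodic job, there is a cmdlet that asks the service to scan part of the share for changes made directly against Azure Files; a hedged sketch (Az.StorageSync module, hypothetical names, and the path parameters may differ by module version):

```powershell
# Ask Azure File Sync to detect changes made directly in the Azure Files share,
# here scoped to a hypothetical "Reports" folder, so they sync out before the next scheduled scan
Invoke-AzStorageSyncChangeDetection -ResourceGroupName "rg-prod" `
    -StorageSyncServiceName "nerdio-sync" -SyncGroupName "public-group" `
    -CloudEndpointName "public-cloud" -DirectoryPath "Reports" -Recursive
```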

Vadim Vladimirskiy
Now, when would you want to use something like this? Why not just use a single file server, set up the data on its disk, and share it out with traditional file sharing, since a file server has to be involved anyway? Why do we need all this complexity of having Azure Files, Azure File Sync, and then all of these different things connecting back to it? Well, the reason is that Azure Files, especially when stored on standard storage, is very cheap. The cost of storage starts at around eight cents per gigabyte per month and goes lower, which is really cheap.

Vadim Vladimirskiy
Now imagine you have a customer that has a huge amount of data, let's call it 10 terabytes or something. They can put it into Azure Files, and 10 terabytes will cost, what is that? At roughly eight cents per gigabyte, that's about 80 dollars per terabyte, so about 800 dollars a month. But still, that's very cheap if you think about how much storage that is. So you can place all of that storage into Azure Files and then set up sync servers, or a single sync server, that can be the cache into that data. So when you need to access a file, the file is available locally; it's close to you via the network, and it's also stored on faster storage within FS01, for example. So you have access to the data that's faster, but you store it in Azure Files at a cheaper rate.

Vadim Vladimirskiy
The only way this works is when you use something called tiering. Tiering is a feature of Azure File Sync that is configurable in terms of how aggressive you want it to be, but the concept behind tiering is that you tell a file server that's a sync server, like FS01, which files to download and which files to only maintain pointers for.
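On the server side, tiering is switched on when you create the server endpoint; a sketch under the same assumptions (Az.StorageSync module, hypothetical names and paths, and FS01 is assumed to have already been registered with Register-AzStorageSyncServer):

```powershell
# Find the registered server object for FS01
$server = Get-AzStorageSyncServer -ResourceGroupName "rg-prod" `
            -StorageSyncServiceName "nerdio-sync" |
          Where-Object { $_.FriendlyName -like "FS01*" }

# Create the server endpoint with cloud tiering: keep at least 20% of the volume free,
# and tier files not accessed in the last 30 days back to pointers in Azure Files
New-AzStorageSyncServerEndpoint -ResourceGroupName "rg-prod" `
    -StorageSyncServiceName "nerdio-sync" -SyncGroupName "public-group" `
    -Name "fs01-public" -ServerResourceId $server.ResourceId `
    -ServerLocalPath "E:\Public" -CloudTiering `
    -VolumeFreeSpacePercent 20 -TierFilesOlderThanDays 30
```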

Vadim Vladimirskiy
So if I am on WF01 and I have 10 terabytes of storage up in Azure Files in a share called Public, that share is synced to FS01 via Azure File Sync. FS01 then shares that folder called Public to me as a share called Public. So now I'm mapping a drive to it. I open it up and I see I have 10 terabytes' worth of data, with 10 million files or however many files are there. When I double-click on a file, that file most likely isn't going to be on FS01 yet; it's going to be only in Azure Files. So what's going to happen is FS01 is going to request that file, and it's going to get downloaded, stored locally, and then passed down to WF01. And then as WF02 or WF03 or anyone else accesses that file, it's now going to be local, and changes that are made to that file are going to be uploaded to Azure Files. If that file is local on any of these other servers, it's going to be downloaded there as well; if not, it's just going to be kept in Azure Files.

Vadim Vladimirskiy
And then eventually, once that file becomes old and stale and isn't being touched often, it's going to be tiered again, which means it's going to be replaced with a pointer and the data is going to stay in Azure Files. So what you can do, effectively, is take a huge amount of data and, if most of it is inactive, allow it to be stored in very cheap storage in the cloud and make it accessible quickly through sync servers, which then present that data as file shares to the client VMs.

Vadim Vladimirskiy
So we talked about leveraging cheap storage; that's one use case of Azure Files. The other use case is data migration. Let's say you have this server right here, which is your on-premises server, and you have a lot of data on it, and you want a seamless way of migrating it into, let's say, this server. And you want to either use Azure Files long term or maybe not even use Azure Files long term. So what you'll do is set up an Azure Files share. You're going to create a new file sync service inside of that account. You're then going to make this on-premises server a sync server, which is going to suck all the data up here, and then you're going to make this server also a sync server, and you're going to set up tiering so that even though all the data will be stubbed, or pointers will be available, in FS01, none of it will actually get downloaded, or most of it won't get downloaded, to FS01.

Vadim Vladimirskiy
So you can kind of set up the structure, maybe even with a separate folder. Let's say FILE01 currently has a lot of data you want to migrate. You can set up a folder called Transfer, set up this whole structure, set up the Azure Files share, set up the Azure File Sync service, set up a Transfer folder here, and then simply copy files from wherever they currently are on FILE01 into that Transfer folder, and those will automatically start appearing in FS01.
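The copy itself can be as simple as a Robocopy run on the on-premises sync server; a sketch with hypothetical paths, assuming the Transfer folder sits inside the path that FILE01's server endpoint is syncing:

```powershell
# On FILE01: copy existing data into the synced Transfer folder.
# Azure File Sync uploads it to the Azure Files share, and FS01 (with tiering on)
# receives mostly pointers rather than full copies of the files.
robocopy "D:\LegacyData" "E:\Public\Transfer" /E /COPY:DAT /DCOPY:T /R:2 /W:5 /MT:16 /LOG:C:\Temp\migration.log
```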
