
NERDIO GUIDE

How to automate right-sizing AVD images based on performance data

Amol Dalvi | July 28, 2025

Overview

Automating the right-sizing of your Azure Virtual Desktop (AVD) images is the process of using real-world performance data to ensure your session host virtual machines (VMs) are provisioned with the optimal amount of resources. By creating automated workflows, you can eliminate wasteful spending on oversized VMs and prevent the poor user experience caused by undersized ones. 

This data-driven approach ensures your AVD environment operates at peak efficiency, balancing cost and performance without constant manual intervention.

What is automated AVD image right-sizing and why is it important?

Understanding the core concepts behind right-sizing is the first step toward building a more efficient and cost-effective AVD environment. This process is about making informed, data-driven decisions rather than relying on guesswork or initial estimates.

What does "right-sizing" mean in the context of Azure Virtual Desktop?

In AVD, right-sizing is the practice of matching a session host VM's resources—specifically its CPU, RAM, and disk type—to the actual demands of the user workloads it supports. It involves analyzing performance over time to find the "Goldilocks" size for your images: not too big, not too small. This avoids two common and costly problems:

  • Over-provisioning: Assigning more CPU or RAM than your users actually need, leading to unnecessary Azure spending.
  • Under-provisioning: Not assigning enough resources, resulting in slow application performance, system lag, and a frustrating user experience.

Why should you base AVD right-sizing on performance data?

Basing your sizing decisions on actual performance data is the only way to know for sure what your users require. Initial deployments are often based on vendor recommendations or assumptions that don't reflect the unique way your users interact with their applications. By collecting and analyzing metrics directly from your session hosts, you replace assumptions with facts, enabling precise adjustments that directly correlate to real-world usage patterns.

What are the primary benefits of right-sizing AVD images?

When you consistently right-size your AVD images, you unlock significant advantages for your organization. The benefits go beyond just saving money and create a more stable and efficient digital workspace.

  • Significant Cost Reduction: Eliminate wasted spend on underutilized resources. For many organizations, this is the single most impactful method for controlling Azure costs.
  • Enhanced User Experience: Provide users with the resources they need for a responsive and productive desktop experience, reducing performance-related help desk tickets.
  • Increased Operational Efficiency: Automating the analysis and resizing process frees up IT administrators from tedious manual monitoring and adjustments, allowing them to focus on more strategic initiatives.

What key performance metrics should you collect for AVD right-sizing?

To right-size effectively, you need to collect the right data. Focusing on a few key performance indicators (KPIs) will give you a clear picture of user demand and session host health without drowning you in unnecessary information.

What are the most critical CPU metrics to monitor?

CPU pressure is a common cause of poor VDI performance. Tracking these metrics will tell you if your session hosts are keeping up with processing demands.

  • Processor\% Processor Time: This is the primary indicator of overall CPU utilization. Consistently high utilization (e.g., above 85-90% for sustained periods) is a clear sign that the VM is undersized.
  • System\Processor Queue Length: This represents the number of threads waiting for processor time. A queue length that is consistently greater than the number of virtual CPUs (vCPUs) indicates a CPU bottleneck.
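These two rules can be captured in a short, testable check. This is a minimal sketch: the 85% utilization figure is the guideline from this section, while the 50% "sustained" fraction and the function name are illustrative assumptions, not fixed standards.

```python
def cpu_undersized(cpu_pct_samples, queue_len_samples, vcpus,
                   busy_threshold=85.0, sustained_fraction=0.5):
    """Flag a session host as CPU-undersized using the two rules above.

    cpu_pct_samples:   '% Processor Time' readings (0-100)
    queue_len_samples: 'Processor Queue Length' readings
    vcpus:             number of virtual CPUs on the VM
    """
    # Rule 1: utilization consistently above the ~85% guideline.
    busy = sum(1 for s in cpu_pct_samples if s >= busy_threshold)
    sustained_high_cpu = busy / len(cpu_pct_samples) >= sustained_fraction

    # Rule 2: run queue consistently longer than the vCPU count.
    long_queue = sum(1 for q in queue_len_samples if q > vcpus)
    sustained_queue = long_queue / len(queue_len_samples) >= sustained_fraction

    return sustained_high_cpu or sustained_queue
```

For example, a 4-vCPU host that spends most samples above 85% CPU with a run queue longer than 4 would be flagged, while one with occasional brief spikes would not.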

What memory metrics are essential for analysis?

Insufficient memory forces the operating system to rely on the much slower disk page file, drastically degrading performance. These metrics help you track memory pressure.

  • Memory\Available MBytes: This shows how much memory is free and available for use. If this value is consistently low, the VM is likely a candidate for a memory upgrade.
  • Memory\Pages/sec: This metric tracks the rate at which pages are read from or written to the disk to resolve hard page faults. A sustained high value indicates the system does not have enough RAM for its workload.

What disk and network performance indicators should be tracked?

While CPU and memory are primary, disk and network performance are also crucial for application responsiveness and the overall user experience.

  • LogicalDisk\Avg. Disk sec/Transfer: This is a measure of disk latency. High latency values can make applications feel slow, even if CPU and RAM are adequate. This metric is critical when choosing between Standard and Premium SSDs.
  • Network Interface\Bytes Total/sec: This helps you understand the network bandwidth consumption of your users, which can be important for sizing decisions in environments with network-intensive applications.
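The memory and disk counters above lend themselves to the same kind of threshold check. The sketch below is data-driven so thresholds are easy to tune; the specific numbers (512 MB free RAM, 1,000 pages/sec, 20 ms disk latency) are common rules of thumb, not Azure-defined limits.

```python
# Hypothetical warning thresholds for the counters discussed above.
# Direction "below" flags values under the limit; "above" flags over it.
THRESHOLDS = {
    "Memory\\Available MBytes":            ("below", 512.0),   # low free RAM
    "Memory\\Pages/sec":                   ("above", 1000.0),  # heavy paging
    "LogicalDisk\\Avg. Disk sec/Transfer": ("above", 0.020),   # >20 ms latency
}

def flag_counters(averages):
    """Return the counters whose average value breaches its threshold.

    averages: dict mapping counter name -> average observed value.
    """
    flagged = []
    for counter, (direction, limit) in THRESHOLDS.items():
        value = averages.get(counter)
        if value is None:
            continue  # counter not collected for this host
        if (direction == "below" and value < limit) or \
           (direction == "above" and value > limit):
            flagged.append(counter)
    return flagged
```

A host averaging only 300 MB of available memory would be flagged as a memory-upgrade candidate, while a 5 ms disk latency would pass.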

Where can you collect this performance data?

Azure provides robust, built-in tools for collecting and storing this information. You can gather all the necessary metrics from these primary sources:

  • Azure Monitor Metrics: The quickest way to view performance data for any Azure VM.
  • Log Analytics Workspace: The central repository for collecting, consolidating, and analyzing performance counter data from your session hosts at scale. This is the foundation for any serious analysis or automation effort.


How can you manually analyze performance data to right-size an image?

Before you can automate, it's essential to understand the logic behind the process. Manually analyzing your performance data helps you learn your environment's unique patterns and forms the basis for any automated workflow you build later.

How do you establish a performance baseline?

A baseline is a snapshot of your environment's performance over a representative period. It's the standard against which you'll measure future performance.

  1. Configure Data Collection: Ensure your session host VMs are sending performance counter data (for CPU, memory, and disk) to a Log Analytics workspace.
  2. Choose a Timeframe: Collect data for a full business cycle, typically two to four weeks, to capture daily and weekly peaks.
  3. Identify "Normal": Analyze the data from this period to understand what constitutes normal and peak usage for your user groups. This baseline becomes your reference point for sizing decisions.
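As a sketch of step 3, "identifying normal" amounts to aggregating timestamped samples into a per-hour profile of average and peak usage. The (hour, cpu_pct) sample format here is an assumption for illustration; in practice the data would come from your Log Analytics workspace.

```python
from collections import defaultdict
from statistics import mean

def hourly_baseline(samples):
    """Build a per-hour-of-day CPU baseline from (hour, cpu_pct) samples.

    Returns {hour: {"avg": ..., "peak": ...}} so "normal" and "peak"
    usage can be compared across the business day.
    """
    by_hour = defaultdict(list)
    for hour, cpu_pct in samples:
        by_hour[hour].append(cpu_pct)
    return {
        hour: {"avg": round(mean(vals), 1), "peak": max(vals)}
        for hour, vals in by_hour.items()
    }
```

Running this over two to four weeks of samples makes the 9 a.m. logon storm and midday peaks visible at a glance, which is exactly the reference point sizing decisions need.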

What is the process for analyzing collected metrics in Azure?

Once you have data in Log Analytics, you can use Azure's tools to query and visualize it. The goal is to find the sweet spot that covers peak demand without significant over-provisioning.

  • Use Kusto Query Language (KQL): Write queries in Log Analytics to find key values like average and peak utilization. A common and highly effective target for right-sizing is the 95th percentile: the utilization level that covers 95% of your observed samples, so sizing to it effectively ignores rare outlier spikes.
  • Visualize with Azure Workbooks: Use Azure Workbooks to create rich visualizations and dashboards from your KQL queries. This makes it much easier to spot trends and present your findings to stakeholders.
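To make the 95th-percentile target concrete, here is the same calculation a KQL percentile() aggregation would perform, written in a few lines of Python. This uses the nearest-rank method, one of several common percentile definitions, so results can differ slightly from other tools.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value that is
    greater than or equal to p% of all samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

# Twenty CPU readings with one outlier spike to 100%:
cpu = [35, 40, 42, 45, 47, 50, 52, 55, 58, 60,
       61, 63, 65, 66, 68, 70, 72, 75, 78, 100]
p95 = percentile(cpu, 95)  # sizes to 78, ignoring the 100% outlier
```

Sizing to the 95th percentile (78%) rather than the absolute peak (100%) is what prevents a single transient spike from forcing you into a larger, more expensive VM.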

How do you translate performance analysis into a new VM size?

With your analysis complete, the final step is to select a new VM instance type or size.

  1. Match Data to a Size: Compare your 95th percentile CPU and memory usage to the available Azure VM sizes. For example, if your analysis shows a workload consistently needs around 3 vCPUs and 14 GB of RAM, a Standard_D4s_v5 (4 vCPU, 16 GiB RAM) would be a good fit, whereas a D8s_v5 would be oversized.
  2. Consider VM Families: Look at different VM families (e.g., D-series for general purpose, E-series for memory-optimized) to see if a different type would be more cost-effective for your workload profile.
  3. Implement and Monitor: After resizing, monitor the VM to ensure the change had the desired effect and did not negatively impact user experience.
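The matching step can itself be scripted once you have the 95th-percentile numbers. In this sketch, the size list is a tiny illustrative subset of the D-series (check current Azure documentation for real specs and pricing), and the 10% headroom factor is an assumption.

```python
# Illustrative subset of general-purpose sizes, ordered smallest-first:
# (name, vCPUs, RAM in GiB). Not a complete or authoritative list.
SIZES = [
    ("Standard_D2s_v5", 2, 8),
    ("Standard_D4s_v5", 4, 16),
    ("Standard_D8s_v5", 8, 32),
]

def pick_size(p95_vcpus, p95_ram_gib, headroom=1.1):
    """Return the smallest size that covers the 95th-percentile demand
    plus ~10% headroom, or None if nothing in the list fits."""
    need_cpu = p95_vcpus * headroom
    need_ram = p95_ram_gib * headroom
    for name, vcpu, ram in SIZES:
        if vcpu >= need_cpu and ram >= need_ram:
            return name
    return None
```

Feeding in the example from step 1 (roughly 3 vCPUs and 14 GiB of RAM) selects Standard_D4s_v5, matching the manual reasoning above.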

What are the methods for automating the AVD right-sizing process?

While manual analysis is insightful, true operational efficiency comes from automation. There are several methods for automating the right-sizing process in Azure, ranging from custom scripts to comprehensive third-party platforms.

Automated AVD right-sizing works best as a continuous loop: live performance data drives an automation engine, which constantly adjusts resources to balance cost savings with user performance. The following sections explore the different methods, from custom scripts to dedicated platforms, that you can use to build and implement this optimization cycle.

This table provides a high-level comparison of the different approaches you can take to automate AVD right-sizing.

Automation Method Comparison

  • Custom PowerShell Scripts
    Primary tool(s): PowerShell, Azure modules (Az.Compute, Az.OperationalInsights)
    Complexity: High
    Best for: Organizations that require highly customized, granular control and have deep in-house scripting expertise.
    Key consideration: Requires significant development, testing, and ongoing maintenance; there is no native user interface for management.

  • Azure Automation
    Primary tool(s): Azure Automation runbooks
    Complexity: High
    Best for: Teams that have already developed PowerShell scripts and need a native Azure service to schedule and run them automatically.
    Key consideration: Primarily a scheduler and host for your scripts; the complex scaling and analysis logic must still be built and maintained from scratch.

  • Azure Logic Apps
    Primary tool(s): Logic Apps designer
    Complexity: Medium
    Best for: IT teams that prefer a low-code, visual workflow designer for creating event-driven automation and integrating with other services (e.g., Teams, Outlook).
    Key consideration: Workflows for complex scaling logic can become difficult to manage, and cost can increase with high-frequency operations.

  • Nerdio Platform
    Primary tool(s): Nerdio Manager for Enterprise
    Complexity: Low
    Best for: Enterprises seeking a turnkey, easy-to-manage solution with a rich feature set specifically designed for AVD cost and performance optimization.
    Key consideration: A comprehensive commercial platform that abstracts away the underlying complexity, providing a UI-driven, auditable, and fully supported solution.

To help you decide which path is right for your organization, the following sections explore each of these methods in greater detail.

How can you use PowerShell scripts for a custom automation solution?

For teams with strong scripting skills, using PowerShell provides granular control over the automation logic.

  • How it works: You can write a PowerShell script that uses the Az.OperationalInsights module to query performance data from your Log Analytics workspace. The script can then analyze this data and use the Az.Compute module to resize VMs that meet your predefined criteria (e.g., CPU utilization below 20% for 7 days).
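The analysis half of such a script is plain decision logic. Below is a minimal Python sketch of the "CPU below 20% for 7 days" criterion; the actual data retrieval (Az.OperationalInsights) and the resize call (Az.Compute) described above are deliberately omitted, and the input format is an assumption for illustration.

```python
def downsize_candidates(daily_avg_cpu_by_vm, threshold=20.0, days=7):
    """Return VMs whose average CPU stayed below `threshold` for each
    of the last `days` daily samples -- the resize criterion above.

    daily_avg_cpu_by_vm: {"vm-name": [day1_avg, day2_avg, ...]}
    """
    candidates = []
    for vm, averages in daily_avg_cpu_by_vm.items():
        recent = averages[-days:]
        # Require a full week of data before acting on a VM.
        if len(recent) >= days and all(a < threshold for a in recent):
            candidates.append(vm)
    return candidates
```

A VM with a single busy day in the window is skipped, which keeps the automation conservative: only consistently idle hosts become downsize candidates.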
  • Best for: Organizations that require highly customized logic and have the in-house expertise to develop and maintain complex scripts.

What is the role of Azure Automation in scheduling and managing scripts?

Azure Automation takes your custom PowerShell scripts and turns them into a manageable, schedulable service.

  • How it works: You import your right-sizing PowerShell script into an Azure Automation account as a "runbook." You can then set a schedule (e.g., run every Sunday at 2:00 AM) to execute the script automatically. This removes the need for a dedicated server to run the script and provides a basic framework for logging and management.
  • Best for: Teams who have already developed PowerShell scripts and need a simple, serverless way to run them on a recurring schedule.

How can you leverage Azure Logic Apps for a low-code workflow approach?

Azure Logic Apps provide a more visual, workflow-based approach to automation, which can be easier to manage than pure code.

  • How it works: You can design a Logic App that triggers on a schedule. The workflow can include a step to run a KQL query against Log Analytics, parse the results, and then loop through each VM that needs resizing, calling the Azure Resource Manager API to perform the change.
  • Best for: IT teams that prefer a low-code, visual designer for building workflows and integrating different services (e.g., sending a notification to Microsoft Teams after a VM is resized).

How can you use a dedicated platform like Nerdio for turnkey automation?

For enterprises seeking a robust, feature-rich solution without the complexity of building and maintaining a custom system, a dedicated platform is the most efficient method.

  • How it works: Nerdio Manager for Enterprise provides a sophisticated, built-in auto-scaling engine designed specifically for AVD. From a user-friendly interface, you define the desired performance and cost parameters. Nerdio continuously monitors the environment and automatically scales session hosts up, down, in, or out based on real-world usage—including CPU, RAM, and active user sessions.
  • What it simplifies: This platform-based approach removes the need to write PowerShell scripts, manage KQL queries, or configure separate Azure services. The complex logic for baselining, scaling, and cost optimization is pre-built and managed within the Nerdio interface.
  • Enterprise-grade advantages: Nerdio offers features beyond simple resizing, such as predictive scaling (scaling up just before business hours), cost-based optimization (choosing the most economical VM to start), and detailed reporting on all scaling actions and their associated cost savings. This provides a comprehensive, manageable, and auditable solution ideal for large-scale AVD deployments.

See this demo to learn how you can optimize processes, improve security, increase reliability, and save up to 70% on Microsoft Azure costs.

What are some best practices for any automation method?

Regardless of the tool you choose, certain best practices are universal for ensuring your automation is safe and effective.

  • Tag Everything: Apply Azure tags to all resources that are part of the automation. This allows your scripts or platform to easily identify which VMs to include or exclude from right-sizing actions.
  • Schedule Off-Hours: Run resizing operations during periods of low usage, such as overnight or on weekends, to avoid disrupting active user sessions.
  • Implement "Do Not Disturb" Logic: Include a mechanism (like a specific tag) to prevent the automation from resizing critical VMs or those undergoing maintenance.
  • Start with Notifications: Before enabling actual resizing actions, run your automation in a "read-only" mode that only sends notifications (via email or Teams) about the changes it would have made. This helps you validate your logic safely.
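The tagging, "do not disturb," and notify-first practices above combine naturally in code. This sketch shows one way to wire them together; the `rightsize-exclude` tag name and the input record shape are hypothetical, not an established convention.

```python
def plan_resizes(vms, dry_run=True):
    """Apply the best practices above: honor an exclusion tag and
    default to a notify-only (dry-run) mode before making changes.

    vms: list of dicts like
         {"name": ..., "tags": {...}, "current": ..., "target": ...}
    """
    actions = []
    for vm in vms:
        # "Do not disturb": skip VMs explicitly opted out via tag.
        if vm["tags"].get("rightsize-exclude") == "true":
            continue
        if vm["current"] == vm["target"]:
            continue  # already right-sized
        verb = "WOULD resize" if dry_run else "Resizing"
        actions.append(f"{verb} {vm['name']}: {vm['current']} -> {vm['target']}")
    return actions
```

Running with the default `dry_run=True` produces a change report you can route to email or Teams; only after validating that report for a few cycles would you flip the flag to perform real resizes.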



About the author

Amol Dalvi

VP, Product

Software product executive and Head of Product at Nerdio, with 15+ years leading engineering teams and 9+ years growing a successful software startup to 20+ employees. A 3x startup founder and angel investor, with deep expertise in Microsoft full stack development, cloud, and SaaS. Patent holder, Certified Scrum Master, and agile product leader.
