How to autoscale in Azure

This guide outlines how to configure native Azure autoscaling, details common limitations, and compares native methods with advanced logic.

Carisa Stringer | December 22, 2025

Azure autoscaling is the process of dynamically adding or removing cloud resources, such as Virtual Machines (VMs) and Azure Virtual Desktop (AVD) session hosts, to align with fluctuating workload demands. For IT professionals, mastering this capability is essential. It serves as the primary mechanism for balancing consistent user performance during peak operations against strict budget controls. 

This guide details the configuration of native Azure scaling rules, identifies common enterprise limitations, and examines how third-party optimization tools like Nerdio address gaps in native logic.

What is Azure autoscaling and how does it work?

Azure autoscaling is a "scale-out" (horizontal) and "scale-in" mechanism that adjusts your infrastructure footprint based on usage metrics. Unlike "scaling up" (vertical scaling), which increases the power of a single server, autoscaling adds or removes entire instances to handle fluctuating load.

The structural difference between these two approaches:

  • Scaling Up (Vertical): This traditional method involves resizing a single specific VM to add more CPU or RAM. It is rarely used for dynamic demand because it typically requires a system reboot and downtime.
  • Scaling Out (Horizontal): This is the cloud-native "Autoscale" method. Instead of making one server stronger, Azure automatically provisions additional identical VM instances to share the workload, ensuring zero downtime for users.

At its core, the Azure native autoscaling engine relies on Azure Monitor to track specific metrics and trigger the pre-defined autoscale rules configured within your autoscale settings. When a threshold is breached, an action is executed.

  • Common Native Triggers:
    • CPU Percentage: e.g., "Scale out if average CPU > 75%."
    • Memory Usage: e.g., "Scale out if available RAM < 1 GB."
    • Queue Depth: Used primarily for App Services to add workers when job queues back up.
    • Time-Based Schedules: Hard-coded start/stop times (e.g., "Start 10 VMs at 8:00 AM").

When these rules trigger, Azure executes an operation on your Virtual Machine Scale Set (VMSS) or AVD Host Pool, effectively provisioning new capacity (scale-out) or deallocating existing resources (scale-in) to save money. In standard application scenarios, this process works in tandem with an Azure Load Balancer to ensure that incoming network traffic is efficiently distributed across the newly provisioned instances.
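The evaluate-and-act loop described above can be sketched in a few lines. This is a hypothetical illustration of the rule logic, not the Azure Monitor API; the function name and thresholds are assumptions chosen to mirror the example rules.

```python
def evaluate_autoscale(avg_cpu_percent: float, instance_count: int,
                       scale_out_threshold: float = 75.0,
                       scale_in_threshold: float = 25.0) -> int:
    """Return the new instance count after applying simple CPU rules."""
    if avg_cpu_percent > scale_out_threshold:
        return instance_count + 1            # scale out: provision one more VM
    if avg_cpu_percent < scale_in_threshold:
        return max(instance_count - 1, 1)    # scale in, but never below one host
    return instance_count                    # within the band: do nothing

print(evaluate_autoscale(82.0, 4))  # CPU above 75% -> scale out to 5
print(evaluate_autoscale(18.0, 4))  # CPU below 25% -> scale in to 3
print(evaluate_autoscale(50.0, 4))  # steady state  -> stays at 4
```

In the real service, the "average over a duration" part matters as much as the threshold: Azure evaluates the metric over a window (10 minutes in the examples above) so that a momentary spike does not trigger an action.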

How do I configure native autoscaling in Azure?

Configuring native autoscaling requires you to define a resource group, a scalable resource (like a VMSS), and a set of logic rules. While powerful, the native path often involves significant manual configuration or PowerShell scripting. This operational overhead increases for organizations leveraging Azure Local, where maintaining consistent autoscaling logic across both cloud and on-premises infrastructure often requires custom development.

Step-by-Step: Virtual Machine Scale Sets (VMSS)

  1. Navigate to your resource: Open the Azure Portal and select your Virtual Machine Scale Set.
  2. Access Scaling settings: In the left-hand menu, click on Scaling (sometimes labeled Autoscale).
  3. Enable Custom Autoscale: Select "Custom autoscale" to define your own logic.
  4. Define a Scale Out Rule:
    • Metric Source: Current resource.
    • Criteria: Average CPU percentage > 75% over a 10-minute duration.
    • Action: Increase instance count by 1.
  5. Define a Scale In Rule:
    • Criteria: Average CPU percentage < 25% over a 10-minute duration.
    • Action: Decrease instance count by 1.
  6. Set Instance Limits: Always define a Minimum (to prevent total outage), Maximum (to prevent budget overruns), and Default instance count. These settings strictly govern the allowable number of instances in the scale set, ensuring that automation never provisions more resources than your budget permits.
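Step 6's limits amount to a clamp applied to whatever instance count the rules request. A minimal sketch of that behavior, with illustrative limit values (the function name and defaults are assumptions for the example, not an Azure API):

```python
from typing import Optional

def clamp_to_limits(desired: Optional[int], minimum: int = 2,
                    maximum: int = 10, default: int = 2) -> int:
    """Apply instance limits: a floor, a cap, and a fallback default."""
    if desired is None:                      # metrics unavailable: use the default
        return default
    return max(minimum, min(desired, maximum))

print(clamp_to_limits(50))    # runaway request capped at the maximum -> 10
print(clamp_to_limits(0))     # floored at the minimum to avoid an outage -> 2
print(clamp_to_limits(None))  # no metric data -> default count 2
```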

Step-by-Step: Azure Virtual Desktop (AVD)

Native AVD autoscaling uses a feature called "Autoscale scaling plans."

  1. Create a Scaling Plan: Search for "Scaling plans" in the Azure search bar and create a new object.
  2. Define Schedules: You must manually define distinct phases for each day:
    • Ramp-up: When to start booting VMs (e.g., 7:00 AM).
    • Peak: The period of highest usage.
    • Ramp-down: When to start restricting new sessions.
    • Off-peak: The quiet hours where most machines should be off.
  3. Assign to Host Pools: Link your Scaling Plan to specific AVD Host Pools.

Note: Native AVD scaling treats "Pooled" and "Personal" host pools differently. You generally cannot mix logic types effectively without creating separate plans.
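The four schedule phases can be modeled as time-of-day boundaries. The sketch below is a simplified, hypothetical resolver; the `SCHEDULE` boundaries are illustrative, and real scaling plans are defined per day (and per time zone) in the portal:

```python
from datetime import time

# Illustrative phase boundaries; real plans set these per day in the portal.
SCHEDULE = [
    (time(7, 0),  "ramp-up"),
    (time(9, 0),  "peak"),
    (time(17, 0), "ramp-down"),
    (time(19, 0), "off-peak"),
]

def phase_at(now: time) -> str:
    """Return the scaling-plan phase in effect at a given time of day."""
    current = "off-peak"  # before the first boundary we are still off-peak
    for start, name in SCHEDULE:
        if now >= start:
            current = name
    return current

print(phase_at(time(8, 30)))  # ramp-up: VMs are booting ahead of the day
print(phase_at(time(12, 0)))  # peak: full capacity expected
print(phase_at(time(23, 0)))  # off-peak: most machines should be off
```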

What are the limitations of native Azure autoscaling?

While native tools are free to use, they often lack the "business context" required for complex enterprise environments. This gap leads to two main problems: frustrated users (due to poor performance) or frustrated finance teams (due to wasted spend).

  • Metric Blindness: Native scaling primarily looks at infrastructure stats (CPU/RAM). It often fails to understand user experience. A CPU might be low, but if the VM is out of user session slots, native scaling won't know to add more capacity.
  • The "Log-off" Problem: Native scale-in is often brutal. When Azure decides to scale in, it may simply shut down a VM, potentially disrupting users. "Drain mode" exists but can be clumsy to orchestrate natively without custom scripting.
  • Hidden Storage Costs: This is a critical oversight. When you stop a VM in Azure natively, you stop paying for compute, but you continue paying for the Premium SSD storage attached to it. Additionally, relying on stopped disks for persistence does not replace the need for a comprehensive Azure backup strategy to recover from data corruption or accidental deletion.
  • Scripting Overhead: To achieve advanced logic—like scaling based on specific user groups or complex session states—IT teams are often forced to write and maintain complex PowerShell runbooks or Logic Apps. Similarly, these custom scripts are often the only way to automatically de-provision unused AVD desktops based on inactivity thresholds rather than just powering them down.

Nerdio vs. Native Azure Autoscaling: What is the difference?

Nerdio Manager for Enterprise sits on top of Azure to provide "logic" that native tools miss. The key difference is that Nerdio scales based on user behavior and cost-efficiency, whereas native Azure scales based on server statistics.

The comparison below outlines the functional differences:

  • Primary trigger: Native Azure uses CPU, RAM, and time schedules; Nerdio Manager adds available sessions and user counts on top of CPU and RAM.
  • Scale-in logic: Native Azure performs a basic VM shutdown (often disruptive); Nerdio Manager uses intelligent Drain Mode (waits for logout) plus empty-host cleanup.
  • Storage optimization: Native Azure is manual or scripted only; Nerdio Manager auto-scales storage, swapping Premium SSD to HDD when a VM stops.
  • Heal capability: Native Azure has none (manual intervention required); Nerdio Manager's Auto-Heal detects and repairs broken hosts automatically.
  • Cost savings: Native Azure is moderate (compute only); Nerdio Manager is high (compute, storage, and just-in-time provisioning).

How does Nerdio handle enterprise autoscaling?

Nerdio addresses the limitations of native scaling by introducing "intent-based" automation. Instead of just reacting to CPU spikes, it anticipates user needs.

1. Session-based scaling

Nerdio uses "Buffer" logic (for example, "always keep 2 session slots open") to proactively manage user demand. This ensures that new capacity is available before users need it, eliminating the "boot storm" delays common with reactive, CPU-based native triggers.

The following sequence illustrates how this proactive logic prevents users from waiting for a VM to boot:

  • State A (Stable): The environment is healthy with 15 active users and a capacity of 20. The "Buffer" is satisfied with 5 open slots.
  • State B (Trigger): As more users log in, the number of open slots drops to 2. Nerdio detects that the buffer threshold has been breached.
  • State C (Action): Nerdio immediately boots a new VM (VM #3) while the current users are still logging in, ensuring capacity grows faster than demand.
  • State D (Result): By the time the 21st user attempts to connect, the new VM is online and ready. The user connects instantly with zero wait time.
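The buffer logic above reduces to a simple calculation: run enough hosts that the configured number of session slots stays free. A minimal sketch of that idea (the function name and the numbers are illustrative assumptions, not Nerdio's implementation):

```python
import math

def hosts_needed(active_sessions: int, sessions_per_host: int,
                 buffer_slots: int) -> int:
    """Smallest host count that keeps `buffer_slots` session slots free."""
    return math.ceil((active_sessions + buffer_slots) / sessions_per_host)

# State A: 15 users, 10 sessions per host, keep 3 slots free -> 2 hosts running
print(hosts_needed(15, 10, 3))  # 2 (capacity 20, 5 open slots, buffer satisfied)
# States B/C: 18 users leave only 2 open slots, breaching the buffer -> boot a 3rd host
print(hosts_needed(18, 10, 3))  # 3
```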

2. Storage auto-scaling

This is a unique cost-saver. A running VM needs a fast, expensive Premium SSD. A stopped VM does not. Nerdio automatically downgrades the OS disk to a cheap Standard HDD the moment the VM stops, and upgrades it back to SSD seconds before it starts. This alone can save ~75% on storage costs for non-persistent machines.
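The arithmetic behind that claim is straightforward. The per-month prices below are assumed, illustrative figures for a 128 GiB disk; check current Azure pricing for your region and disk size:

```python
premium_ssd = 19.71   # $/month for a 128 GiB Premium SSD (illustrative figure)
standard_hdd = 5.89   # $/month for a 128 GiB Standard HDD (illustrative figure)

# Fraction saved on storage for every hour the VM sits stopped on the HDD tier
saving_while_stopped = 1 - standard_hdd / premium_ssd
print(f"{saving_while_stopped:.0%}")  # roughly 70% cheaper while the VM is off
```

The exact percentage depends on disk size, region, and how long the VM stays stopped, which is why savings approach the quoted ~75% only for non-persistent machines that spend most of the day powered off.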

3. Pre-staging resources

Rather than a simple "Ramp-up" time, Nerdio allows for granular pre-staging. You can specify exactly how many hosts should be ready by 8:00 AM, ensuring the morning login wave is smooth, without paying for those VMs to run all night.

4. Personal desktop scaling

Native Azure struggles to scale "Personal" (1:1 assigned) desktops efficiently. Nerdio allows you to power off a personal desktop automatically when the specific user logs off or disconnects, and power it back on—via a "Start on Connect" feature—the moment they try to access it again.

Best practices for autoscaling in Azure

Whether you use native tools or Nerdio, adhering to these engineering principles will save you from outage scenarios and "bill shock." Implementing these safeguards is a critical step in establishing the best practices for using automation and auto-scaling to manage AVD cost and performance.

  • Avoid "Flapping": Ensure a wide gap between your scale-out and scale-in thresholds. If you scale out at 80% CPU and in at 70%, a minor fluctuation will cause the VM to continuously boot and shut down (flapping), which kills performance and inflates costs.
  • Always Set Limits: Never leave "Maximum instances" uncapped. A misconfigured script or a runaway process could theoretically spin up hundreds of VMs, resulting in a massive bill.
  • Test "Drain" Times: Analyze your user behavior. If users typically stay logged in for 8 hours, aggressive scale-in rules (e.g., shutting down after 10 minutes of low CPU) will constantly interrupt them. Tune your "Drain Mode" timeouts accordingly.
  • Monitor the Monitor: Autoscaling relies on metrics. If the metric agent fails, scaling fails. Set up an alert (using Azure Monitor or Nerdio) that notifies you if the scaling engine itself hasn't reported a heartbeat.
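The anti-flapping advice translates into two mechanisms: a wide hysteresis band between the scale-out and scale-in thresholds, and a cooldown after each action. A toy sketch of both (the class and its parameters are illustrative assumptions, not an Azure or Nerdio API):

```python
class Autoscaler:
    """Toy scaler with a hysteresis band and a cooldown to prevent flapping."""

    def __init__(self, out_at: float = 75.0, in_at: float = 25.0,
                 cooldown_ticks: int = 3):
        self.out_at, self.in_at = out_at, in_at       # wide band: 25% .. 75%
        self.cooldown_ticks = cooldown_ticks
        self.since_last_action = cooldown_ticks       # allow an immediate first action
        self.instances = 2

    def tick(self, avg_cpu: float) -> int:
        self.since_last_action += 1
        if self.since_last_action < self.cooldown_ticks:
            return self.instances                     # still cooling down: ignore metric
        if avg_cpu > self.out_at:
            self.instances += 1
            self.since_last_action = 0
        elif avg_cpu < self.in_at:
            self.instances = max(1, self.instances - 1)
            self.since_last_action = 0
        return self.instances

# CPU oscillating around the scale-out threshold does not cause repeated actions:
scaler = Autoscaler()
print([scaler.tick(cpu) for cpu in (80, 74, 76, 74, 80)])  # [3, 3, 3, 3, 4]
```

Without the cooldown, the third sample (76%) would have triggered a second scale-out moments after the first; with it, the scaler waits out the fluctuation.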

Conclusion

Autoscaling is the difference between a cloud environment that burns cash and one that drives value. While native Azure autoscaling provides the fundamental building blocks for simple, static workloads, it often requires significant manual effort to tune for complex, user-centric environments.

For enterprise IT teams managing Azure Virtual Desktop or dynamic VM workloads, Nerdio offers a necessary layer of intelligence. By scaling based on actual user sessions and automating storage tiering, it solves the technical gaps left by native tools—delivering a smoother user experience and deeper cost reductions.

Next Step: Are you overpaying for your Azure compute and storage? Use Nerdio’s Cost Estimator to model your environment and see exactly how much you could save with optimized autoscaling logic.

About the author

Carisa Stringer

Head of Product Marketing

Carisa Stringer is the Head of Product Marketing at Nerdio, where she leads the strategy and execution of go-to-market plans for the company’s enterprise and managed service provider solutions. She joined Nerdio in 2025, bringing 20+ years of experience in end user computing, desktops-as-a-service, and Microsoft technologies. Prior to her current role, Carisa held key product marketing positions at Citrix and Anthology, where she contributed to innovative go-to-market initiatives. Her career reflects a strong track record in driving growth and adoption in the enterprise technology sector. Carisa holds a Bachelor of Science in Industrial Engineering from the Georgia Institute of Technology.
