Right Sizing Opportunity

Overview

The Right Sizing Opportunity recommendation uses Azure Advisor's native right-sizing recommendations to identify underutilized virtual machines and suggest resizing them or shutting them down to reduce costs.

How It Works

DigiUsher integrates with the Azure Advisor API to retrieve machine learning-based right-sizing recommendations. Azure Advisor analyzes VM usage patterns and suggests either resizing to a smaller SKU or shutting down underutilized resources.

Supported Cloud Providers

  • Azure - Virtual machines

Selection Criteria

Recommendations are generated when all of the following conditions are met:

  • Azure Advisor identifies a right-sizing opportunity with recommendation type ID e10b1381-5f0a-47ff-8c7b-37bd13d7c974
  • The virtual machine resource is monitored by DigiUsher
  • Azure Advisor provides valid recommendation data with savings calculations

Two types of recommendations are supported:

  • SKU Change: Resize to a smaller VM size
  • Shutdown: Shut down underutilized VMs
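
The criteria above can be sketched as a simple filter. This is an illustration only, not DigiUsher's implementation: the function name `select_recommendations` and the dictionary keys (`recommendation_type_id`, `resource_id`, `savings_amount`, `recommendation_type`) are assumptions made for the example. Only the type ID and the two recommendation types come from this page.

```python
# Recommendation type ID quoted in the selection criteria.
RIGHT_SIZING_TYPE_ID = "e10b1381-5f0a-47ff-8c7b-37bd13d7c974"
SUPPORTED_TYPES = {"SkuChange", "Shutdown"}


def select_recommendations(advisor_items, monitored_resource_ids):
    """Keep only items that satisfy every selection criterion.

    advisor_items: list of dicts with hypothetical keys (see lead-in).
    monitored_resource_ids: resource IDs of VMs that are monitored.
    """
    # Azure resource IDs are case-insensitive, so compare in lowercase.
    monitored = {rid.lower() for rid in monitored_resource_ids}
    selected = []
    for item in advisor_items:
        if item.get("recommendation_type_id") != RIGHT_SIZING_TYPE_ID:
            continue  # not a right-sizing recommendation
        if item.get("resource_id", "").lower() not in monitored:
            continue  # the VM is not monitored
        savings = item.get("savings_amount")
        if not isinstance(savings, (int, float)) or savings <= 0:
            continue  # no usable savings data
        if item.get("recommendation_type") not in SUPPORTED_TYPES:
            continue  # neither a SKU change nor a shutdown
        selected.append(item)
    return selected
```

Treating a missing or non-positive savings amount as "invalid recommendation data" is one possible reading of the third criterion.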

Configuration Options

Parameter            Default  Description
skip_cloud_accounts  []       Cloud accounts to skip during analysis
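
As a rough sketch of how this option might behave, the snippet below drops skipped accounts before analysis. The function name `accounts_to_analyze` and the assumption that entries are matched by cloud account ID are hypothetical; the page does not specify the matching rule.

```python
def accounts_to_analyze(cloud_account_ids, skip_cloud_accounts=None):
    """Return the account IDs that remain after applying skip_cloud_accounts.

    The default of an empty list means no account is skipped.
    """
    skipped = set(skip_cloud_accounts or [])
    return [account for account in cloud_account_ids if account not in skipped]
```

With the default value, every account is analyzed; listing an account ID removes only that account from the analysis.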

Recommendation Types

SKU Change Recommendations

  • Resize Down: Move to a smaller VM size within the same family
  • Family Change: Move to a different VM family with better price-performance
  • Generation Upgrade: Move to a newer generation with better efficiency
  • Specialized SKUs: Move to cost-optimized or burstable SKUs

Shutdown Recommendations

  • Idle VMs: VMs with minimal CPU and network activity
  • Unused Resources: VMs that appear to be completely unused
  • Development Resources: Non-production VMs that can be shut down
  • Temporary Resources: VMs created for temporary purposes

Savings Calculation

Savings are calculated using Azure Advisor data:

Monthly Savings = Azure Advisor Savings Amount × Exchange Rate

The calculation considers:

  • Current VM costs based on actual usage
  • Recommended action savings (resize or shutdown)
  • Currency conversion to organization currency
  • Azure's machine learning analysis of usage patterns
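
The formula above can be written as a small helper. This is a sketch under stated assumptions: the function name `monthly_savings` is hypothetical, and rounding to two decimal places is a choice made for the example, not something this page specifies.

```python
from decimal import Decimal, ROUND_HALF_UP


def monthly_savings(advisor_savings_amount, exchange_rate):
    """Monthly Savings = Azure Advisor Savings Amount x Exchange Rate.

    Inputs are converted through str() so that floats such as 0.92 do not
    carry binary rounding noise into the Decimal arithmetic.
    """
    amount = Decimal(str(advisor_savings_amount)) * Decimal(str(exchange_rate))
    # Rounding to cents is an assumption made for this example.
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, an Advisor savings amount of 1305.00 with an exchange rate of 0.92 gives 1200.60 in the organization currency.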

Example Output

{
  "cloud_resource_id": "/subscriptions/12345678-1234-5678-9012-123456789012/resourcegroups/monitoring-rg/providers/microsoft.compute/virtualmachines/monitoring-server",
  "resource_name": "monitoring-server",
  "problem": "Right-size or shutdown underutilized virtual machines",
  "solution": "Right-size underutilized virtual machines. Change from Standard_D32s_v5 to Standard_E16as_v5",
  "saving_currency": "USD",
  "region": "eastus",
  "current_flavor": "Standard_D32s_v5",
  "recommended_flavor": "Standard_E16as_v5",
  "cpu_utilization": "16",
  "memory_utilization": "46",
  "network_utilization": "0",
  "recommendation_type": "SkuChange",
  "resource_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "cloud_account_id": "12345678-1234-5678-9012-123456789012",
  "cloud_type": "azure_cnr",
  "cloud_account_name": "Production-Operations",
  "last_seen": 1702123456,
  "saving": 1200.50,
  "detected_at": 1702000000,
  "pool": {
    "id": "abcd1234-ef56-7890-abcd-123456789012",
    "name": "Operations",
    "purpose": "budget"
  }
}
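
A consumer of this output might reduce a record to a short summary, as in the sketch below. The function name `summarize` is hypothetical. The sketch assumes that `detected_at` is a Unix timestamp in seconds (UTC) and that the utilization fields are percentages encoded as strings, as they appear in the sample.

```python
import json
from datetime import datetime, timezone


def summarize(record_json):
    """Turn one recommendation record (a JSON string) into a short summary."""
    record = json.loads(record_json)
    # Assumption: timestamps are Unix epoch seconds in UTC.
    detected = datetime.fromtimestamp(record["detected_at"], tz=timezone.utc)
    return {
        "resource": record["resource_name"],
        "action": record["recommendation_type"],
        "current_flavor": record["current_flavor"],
        # A shutdown record may not carry a target size, so use .get().
        "recommended_flavor": record.get("recommended_flavor"),
        # Utilization values are strings in the sample; convert to numbers.
        "cpu_percent": float(record["cpu_utilization"]),
        "saving": f'{record["saving"]:.2f} {record["saving_currency"]}',
        "detected_at": detected.isoformat(),
    }
```

Applied to the sample record, the summary reports a SkuChange from Standard_D32s_v5 to Standard_E16as_v5 with a saving of 1200.50 USD.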

Azure Advisor Integration

Machine Learning Analysis

  • Usage Pattern Recognition: Identifies consistent underutilization patterns
  • Workload Analysis: Analyzes CPU, memory, and network usage
  • Time-Based Patterns: Considers usage patterns over time
  • Resource Dependencies: Considers dependencies when making recommendations

Recommendation Quality

  • Confidence Levels: High, Medium, Low confidence ratings
  • Historical Analysis: Based on 7-30 days of historical data
  • Performance Impact: Considers performance impact of changes
  • Cost-Benefit Analysis: Weighs savings against potential performance impact

Risk Assessment

Low Risk

  • VMs with consistently low utilization (< 10% CPU)
  • Development and testing environments
  • VMs with clear oversizing (e.g., 8 cores used at 5%)

Medium Risk

  • VMs with moderate utilization (10-30% CPU)
  • Production VMs with predictable workloads
  • VMs where resizing maintains adequate performance headroom

High Risk

  • VMs with variable workload patterns
  • Critical production systems
  • VMs with strict performance requirements
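
The three tiers above overlap (for example, a critical production VM can also run below 10% CPU), so any automated classification needs a precedence rule. The sketch below applies one possible rule: high-risk indicators win, then the low-risk indicators, then the CPU band. The function name `classify_risk`, its parameters, the precedence order, and the handling of CPU above 30% are all assumptions made for illustration.

```python
def classify_risk(cpu_percent, is_production, critical=False,
                  variable_workload=False, strict_performance=False):
    """Map a VM onto the Low / Medium / High risk tiers described above."""
    # High-risk indicators take precedence over utilization figures.
    if critical or variable_workload or strict_performance:
        return "high"
    # Low risk: non-production environments or consistently low CPU (< 10%).
    if not is_production or cpu_percent < 10:
        return "low"
    # Medium risk: production VMs with moderate utilization (10-30% CPU).
    if cpu_percent <= 30:
        return "medium"
    # Above 30% CPU is not covered by the tiers; treat it conservatively.
    return "high"
```

For instance, `classify_risk(5, is_production=True)` returns "low", while the same VM flagged as `critical=True` returns "high".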

Best Practices

Before Implementation

  1. Validate Recommendations: Review Azure Advisor recommendations carefully
  2. Performance Testing: Test recommended sizes in non-production first
  3. Monitoring Setup: Establish monitoring before making changes
  4. Backup Plans: Ensure you can quickly resize back if needed

Implementation Strategy

  1. Start with Development: Begin with non-production environments
  2. Gradual Changes: Make incremental size changes
  3. Monitor Impact: Monitor closely during and after changes
  4. User Feedback: Gather feedback on application performance

Post-Implementation

  1. Performance Validation: Verify performance meets requirements
  2. Cost Tracking: Confirm cost savings are realized
  3. Ongoing Monitoring: Continue monitoring for optimization opportunities
  4. Documentation: Document changes and lessons learned

VM Family Considerations

General Purpose (B, D series)

  • B-series: Burstable performance for variable workloads
  • D-series: Balanced CPU-to-memory ratio for most workloads

Compute Optimized (F series)

  • High CPU: Optimized for CPU-intensive applications
  • Low Memory: Lower memory-to-CPU ratio
  • Use Cases: Web servers, application servers, batch processing

Memory Optimized (E, M series)

  • High Memory: High memory-to-CPU ratio
  • Use Cases: Databases, in-memory analytics, caching

Storage Optimized (L series)

  • High Disk Throughput: Optimized for storage-intensive workloads
  • Use Cases: Big data, databases, distributed file systems
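
When reviewing a SKU change such as Standard_D32s_v5 to Standard_E16as_v5, it can help to see which category each size belongs to. The sketch below derives the category from the first series letter of the size name. It is a simplification: the function name `family_category` is hypothetical, only the series named on this page are mapped, F is mapped to Compute Optimized, and multi-letter or other series (for example GPU sizes) fall back to the first letter or to "Unknown".

```python
import re

# Series letters taken from the categories listed above.
FAMILY_CATEGORIES = {
    "B": "General Purpose",
    "D": "General Purpose",
    "F": "Compute Optimized",
    "E": "Memory Optimized",
    "M": "Memory Optimized",
    "L": "Storage Optimized",
}

# Matches names such as "Standard_D32s_v5" and captures the series letters.
_SKU_PATTERN = re.compile(r"^Standard_([A-Z]+)\d", re.IGNORECASE)


def family_category(sku_name):
    """Return the category for a VM size name, or "Unknown"."""
    match = _SKU_PATTERN.match(sku_name)
    if not match:
        return "Unknown"
    # Heuristic: use only the first series letter (e.g. "DS" -> "D").
    series_letter = match.group(1)[0].upper()
    return FAMILY_CATEGORIES.get(series_letter, "Unknown")
```

Using the sizes from the example output, `family_category("Standard_D32s_v5")` yields "General Purpose" and `family_category("Standard_E16as_v5")` yields "Memory Optimized", which makes the family change explicit.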

Implementation Considerations

Downtime Requirements

  • Resize Operations: Most resize operations require a VM restart
  • Planned Maintenance: Schedule during maintenance windows
  • High Availability: Consider impact on HA configurations
  • Load Balancers: Update load balancer configurations if needed

Performance Impact

  • CPU Performance: Ensure adequate CPU for peak loads
  • Memory Requirements: Verify memory requirements are met
  • Network Performance: Consider network performance changes
  • Storage Performance: Ensure storage performance is adequate

Application Compatibility

  • Resource Requirements: Verify applications work with smaller resources
  • Performance Testing: Test applications under load
  • Monitoring Thresholds: Update monitoring thresholds for new sizes
  • Scaling Policies: Update auto-scaling policies if applicable

Monitoring and Validation

Key Metrics

  • CPU Utilization: Monitor CPU usage after resizing
  • Memory Utilization: Track memory usage patterns
  • Application Performance: Monitor application response times
  • User Experience: Track user-reported performance issues

Alert Configuration

  • High Utilization: Alerts for sustained high resource usage
  • Performance Degradation: Alerts for performance issues
  • Application Errors: Monitor for increased error rates
  • Capacity Planning: Alerts for when further scaling is needed
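
A "sustained high utilization" alert can be approximated by checking for a run of consecutive samples above a threshold. The sketch below is one way to express that check after a resize. The function name `sustained_high_utilization`, the 80% threshold, and the three-sample window are illustrative assumptions, not values defined by DigiUsher or Azure.

```python
def sustained_high_utilization(samples, threshold=80.0, min_consecutive=3):
    """Return True if at least min_consecutive samples in a row reach threshold.

    samples: utilization percentages ordered by time (e.g. one per interval).
    """
    run = 0
    for value in samples:
        # Extend the current run, or reset it when usage drops below threshold.
        run = run + 1 if value >= threshold else 0
        if run >= min_consecutive:
            return True
    return False
```

A single spike does not trigger the check; only a continuous run does, which keeps short bursts from prompting an unnecessary rollback.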

API Integration

This recommendation is available through the DigiUsher API:

  • Recommendation type: right_sizing_opportunity
  • Provides Azure Advisor integration data
  • Supports bulk operations for right-sizing initiatives
  • Includes confidence levels and savings calculations

Automation Opportunities

Automated Analysis

  • Continuous Monitoring: Regular analysis of VM utilization
  • Recommendation Updates: Regular updates from Azure Advisor
  • Pattern Recognition: Identify trends in resource usage
  • Cost Tracking: Automated tracking of optimization savings

Automated Resizing

  • Policy-Based Resizing: Automated resizing based on policies
  • Approval Workflows: Automated approval processes for changes
  • Testing Integration: Automated testing before production changes
  • Rollback Automation: Automated rollback on performance issues

Cost Optimization Strategy

Immediate Actions

  • Quick Wins: Implement obvious oversizing fixes
  • Development First: Start with non-critical environments
  • High-Impact Changes: Focus on VMs with highest savings potential

Long-Term Strategy

  • Regular Reviews: Monthly reviews of Azure Advisor recommendations
  • Optimization Culture: Build culture of continuous optimization
  • Automation Integration: Integrate with DevOps and automation workflows
  • Cost Awareness: Increase team awareness of VM costs and optimization

Compliance and Governance

Change Management

  • Approval Processes: Require approval for production VM changes
  • Impact Assessment: Assess potential impact before changes
  • Documentation: Document all resizing decisions and outcomes
  • Rollback Procedures: Maintain clear rollback procedures

Performance Governance

  • SLA Compliance: Ensure changes don't violate SLAs
  • Performance Standards: Maintain performance standards
  • Monitoring Requirements: Ensure adequate monitoring is in place
  • Incident Response: Update incident response for new configurations