Scaling Methods
Ryvn supports two primary methods for scaling your applications:
Manual Scaling
Run a fixed number of replicas that you specify.
Autoscaling
Automatically adjusts replica count based on one or more triggers: CPU/memory utilization, Temporal task queue depth, or RabbitMQ queue metrics.
Manual Scaling
Manual scaling gives you direct control over your application’s capacity by running a fixed number of replicas. Each replica runs an identical copy of your application, and Ryvn’s load balancer automatically distributes incoming requests across all available replicas. To configure manual scaling:
When you update the number of replicas, Ryvn immediately begins provisioning or deprovisioning replicas to match your desired count. You can scale your installation up to a maximum of 100 replicas.
Autoscaling
Autoscaling provides dynamic resource management by automatically adjusting the number of running replicas based on real-time metrics. You can combine multiple triggers so that Ryvn scales on whichever metric demands the most replicas.
Configuration
To configure autoscaling:
Understanding Autoscaling Parameters
- minReplicas: Ryvn will maintain at least this many replicas, even during low utilization. Set to 0 to allow scaling to zero when all triggers are idle.
- maxReplicas: Ryvn will not exceed this number of replicas, even during high utilization.
- triggers: One or more scaling triggers that determine when to scale (see below).
Trigger Types
You can add any combination of the following triggers. When multiple triggers are active, Ryvn scales based on whichever trigger requires the most replicas.
CPU / Memory
Scale based on average CPU and/or memory utilization across all running replicas.
- CPU target: The CPU utilization percentage that triggers scaling (recommended: 70-80%)
- Memory target: The memory utilization percentage that triggers scaling (recommended: 80-85%)
Ryvn uses the formula new_replicas = ceil(current_replicas * (current_util / target_util)) to calculate the desired replica count.
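As a worked example of the formula above, the following Python sketch computes the desired replica count for a single CPU or memory trigger. The function name and the sample numbers are illustrative only and are not part of Ryvn's API.

```python
import math


def desired_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Apply new_replicas = ceil(current_replicas * (current_util / target_util))."""
    if target_util <= 0:
        raise ValueError("target_util must be positive")
    return math.ceil(current_replicas * (current_util / target_util))


# 4 replicas averaging 90% CPU against a 75% target: 4 * 1.2 = 4.8, rounded up to 5.
print(desired_replicas(4, 90, 75))
# 4 replicas averaging 30% CPU against a 75% target: 4 * 0.4 = 1.6, rounded up to 2.
print(desired_replicas(4, 30, 75))
```

Because the result is rounded up, utilization slightly above the target is enough to add a replica, while scaling down only happens once utilization drops far enough for a smaller count to still meet the target.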
Temporal Queue
Scale based on the depth of a Temporal task queue. This is useful for worker installations that process tasks from a Temporal workflow.
- Integration: Select a configured Temporal integration for your organization
- Task Queue: The name of the Temporal task queue to monitor
- Target Queue Size: The number of pending tasks per replica that triggers scaling
RabbitMQ Queue
Scale based on RabbitMQ queue metrics. This is useful for consumer installations that process messages from a RabbitMQ queue.
- Connection Method: Either provide a direct endpoint URL or reference an environment variable (defaults to RABBITMQ_HOST)
- Queue Name: The RabbitMQ queue to monitor
- Scaling Mode: Scale by Queue Length (pending messages per replica) or Message Rate (messages per second per replica)
- Protocol: AMQP or HTTP (the HTTP management API). Message Rate mode requires HTTP
- Target Value: The threshold per replica that triggers scaling
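The constraints on a RabbitMQ trigger can be checked before you apply a configuration. The sketch below is a hypothetical validator: the class, field names, and error messages are invented for illustration and do not reflect Ryvn's actual configuration schema. Only the option values and the rule that Message Rate requires HTTP come from the list above. The positive-target check is an additional assumption.

```python
from dataclasses import dataclass
from typing import List

SCALING_MODES = {"Queue Length", "Message Rate"}
PROTOCOLS = {"AMQP", "HTTP"}


@dataclass
class RabbitMQTrigger:
    # Hypothetical field names that mirror the options listed above.
    queue_name: str
    scaling_mode: str
    protocol: str
    target_value: float
    host_env_var: str = "RABBITMQ_HOST"  # documented default environment variable


def validate(trigger: RabbitMQTrigger) -> List[str]:
    """Return a list of problems; an empty list means the trigger looks consistent."""
    errors: List[str] = []
    if not trigger.queue_name:
        errors.append("queue name is required")
    if trigger.scaling_mode not in SCALING_MODES:
        errors.append(f"unknown scaling mode: {trigger.scaling_mode}")
    if trigger.protocol not in PROTOCOLS:
        errors.append(f"unknown protocol: {trigger.protocol}")
    # Documented constraint: message rates are only available over the HTTP management API.
    if trigger.scaling_mode == "Message Rate" and trigger.protocol != "HTTP":
        errors.append("Message Rate mode requires the HTTP protocol")
    # Assumption: a per-replica threshold only makes sense when it is positive.
    if trigger.target_value <= 0:
        errors.append("target value must be positive")
    return errors


print(validate(RabbitMQTrigger("orders", "Message Rate", "AMQP", 5)))
```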
How Autoscaling Works
Ryvn continuously monitors the metrics defined by your triggers. When any trigger exceeds its target, the following process occurs:
Evaluate Triggers
Ryvn evaluates each active trigger and calculates the replica count each one requires.
Scale Up
If more replicas are needed, Ryvn immediately provisions new replicas to handle the increased load.
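The way multiple triggers combine can be sketched in a few lines of Python. The sketch takes the replica count each trigger requires, picks the largest, and keeps the result within minReplicas and maxReplicas. The function name is hypothetical and this is an illustration of the documented behavior, not Ryvn's implementation.

```python
from typing import List


def target_replica_count(per_trigger: List[int], min_replicas: int, max_replicas: int) -> int:
    """Pick the most demanding trigger, then clamp to [min_replicas, max_replicas]."""
    if min_replicas > max_replicas:
        raise ValueError("min_replicas cannot exceed max_replicas")
    wanted = max(per_trigger, default=0)  # no triggers means no demand
    return max(min_replicas, min(wanted, max_replicas))


# CPU wants 3 replicas, the queue trigger wants 7: the queue trigger wins.
print(target_replica_count([3, 7], min_replicas=2, max_replicas=10))
# Demand above maxReplicas is capped.
print(target_replica_count([3, 40], min_replicas=2, max_replicas=10))
# With minReplicas set to 0, idle triggers allow scaling to zero.
print(target_replica_count([0, 0], min_replicas=0, max_replicas=10))
```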
Best Practices
When implementing scaling for your applications, consider the following recommendations:
Initial Setup
Start with manual scaling to understand your application’s baseline resource needs and traffic patterns. This information will help you set appropriate autoscaling parameters later.
Resource Targets
Set your CPU target between 70-80% and memory target between 80-85%. These ranges provide good resource utilization while maintaining headroom for traffic spikes.
Queue-Based Scaling
For Temporal or RabbitMQ triggers, start with a conservative target value and adjust based on your workers’ throughput.
Combine a queue trigger with a CPU/Memory trigger to handle both queue backlogs and resource-intensive workloads.
High Availability
For production servers, configure a minimum of two replicas to maintain high availability. Consider your application’s cold start time when setting the minimum replica count.
Monitoring
Regularly review your scaling patterns and adjust thresholds based on actual usage and application performance metrics.
Monitoring and Debugging
You can monitor your server’s scaling behavior through the Ryvn Dashboard, which provides real-time visibility into:
- Current replica count and status
- Scaling events history
- CPU and memory utilization metrics
- Queue depth and message rate for queue-based triggers
For advanced scaling scenarios or help with debugging scaling issues, please reach out to our support team.