Scaling

Handle varying workloads by running multiple replicas of your Servers. For installations that receive traffic, Ryvn automatically distributes it evenly across your running replicas.

Installations with persistent storage cannot scale to multiple replicas.

Scaling methods

Ryvn supports two primary methods for scaling your applications:

Manual scaling

Run a fixed number of replicas that you specify.

Autoscaling

Automatically adjust the replica count based on one or more triggers: CPU/memory utilization, Temporal task queue depth, or RabbitMQ queue metrics.

Manual scaling

Manual scaling gives you direct control over your application’s capacity by running a fixed number of replicas. Each replica runs an identical copy of your application, and Ryvn’s load balancer automatically distributes incoming requests across all available replicas.

To configure manual scaling:

Navigate to Environments

Go to the Environments tab in the Ryvn Dashboard

Select environment

Choose the environment containing your installation

Select installation

Click on the installation you want to configure

Configure scaling

In the Scaling section, select Manual Scaling
Enter the desired number of replicas (up to 100)
Click Save to apply the changes

When you update the number of replicas, Ryvn immediately begins provisioning or deprovisioning replicas to match your desired count. An installation can run at most 100 replicas.

Autoscaling

Autoscaling provides dynamic resource management by automatically adjusting the number of running replicas based on real-time metrics. You can combine multiple triggers so that Ryvn scales on whichever metric demands the most replicas.

Configuration

To configure autoscaling:

Navigate to Environments

Go to the Environments tab in the Ryvn Dashboard

Select environment

Choose the environment containing your installation

Select installation

Click on the installation you want to configure

Enable autoscaling

In the Scaling section, select Autoscaling
Set the minimum and maximum number of replicas
Click Add Trigger to add one or more autoscaling triggers
Click Save to apply the changes

Understanding autoscaling parameters

minReplicas: Ryvn will maintain at least this many replicas, even during low utilization. Set to 0 to allow scaling to zero when all triggers are idle.
maxReplicas: Ryvn will not exceed this number of replicas, even during high utilization.
triggers: One or more scaling triggers that determine when to scale (see below)
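
As a rough illustration of how minReplicas and maxReplicas bound the outcome, the sketch below clamps a desired replica count to the configured range. The function name clamp_replicas and its arguments are hypothetical and are not part of Ryvn's API; this only models the behavior described above.

```python
def clamp_replicas(desired: int, min_replicas: int, max_replicas: int) -> int:
    """Bound a desired replica count to the configured [min, max] range."""
    if min_replicas < 0 or max_replicas < min_replicas:
        raise ValueError("expected 0 <= min_replicas <= max_replicas")
    return max(min_replicas, min(desired, max_replicas))


# A trigger asks for 12 replicas, but maxReplicas is 10.
print(clamp_replicas(12, min_replicas=2, max_replicas=10))  # 10
# All triggers are idle, but minReplicas keeps 2 replicas running.
print(clamp_replicas(0, min_replicas=2, max_replicas=10))   # 2
# With minReplicas set to 0, idle triggers allow scaling to zero.
print(clamp_replicas(0, min_replicas=0, max_replicas=10))   # 0
```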

Trigger types

You can add any combination of the following triggers. When multiple triggers are active, Ryvn scales based on whichever trigger requires the most replicas.

CPU / memory

Scale based on average CPU and/or memory utilization across all running replicas.

CPU target: The CPU utilization percentage that triggers scaling (recommended: 70-80%)
Memory target: The memory utilization percentage that triggers scaling (recommended: 80-85%)

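You can enable both metrics or just one. Ryvn uses the formula new_replicas = ceil(current_replicas * (current_util / target_util)) to calculate the desired replica count, where current_util is the observed average utilization and target_util is the target you configured.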
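
To make the formula concrete, here is a small worked example. The function name desired_replicas is hypothetical; the arithmetic is the formula quoted above applied to a single metric, before any minimum or maximum replica limits are applied.

```python
import math


def desired_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Apply new_replicas = ceil(current_replicas * (current_util / target_util))."""
    if target_util <= 0:
        raise ValueError("target_util must be positive")
    return math.ceil(current_replicas * (current_util / target_util))


# 4 replicas averaging 90% CPU against a 75% target: 4 * 1.2 = 4.8, rounded up.
print(desired_replicas(4, current_util=90, target_util=75))  # 5
# The same 4 replicas averaging 30% CPU: 4 * 0.4 = 1.6, rounded up.
print(desired_replicas(4, current_util=30, target_util=75))  # 2
```

Because the result is rounded up, the calculation errs on the side of extra capacity rather than too little.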

Temporal queue

Scale based on the depth of a Temporal task queue. This is useful for worker installations that process tasks from a Temporal workflow.

Integration: Select a configured Temporal integration for your organization
Task Queue: The name of the Temporal task queue to monitor
Target Queue Size: The number of pending tasks per replica that triggers scaling

You can optionally override the endpoint and namespace from the selected integration.
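
This page does not state the exact formula used for queue triggers. One plausible reading of "pending tasks per replica" is that the backlog is divided by the target and rounded up. The sketch below shows that interpretation only; replicas_for_backlog is a hypothetical helper, not a Ryvn function.

```python
import math


def replicas_for_backlog(pending_tasks: int, target_per_replica: int) -> int:
    """Assumed model: one replica per `target_per_replica` pending tasks, rounded up."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    return math.ceil(pending_tasks / target_per_replica)


# 250 pending tasks with a Target Queue Size of 100 per replica.
print(replicas_for_backlog(250, target_per_replica=100))  # 3
# An empty queue asks for no replicas; minReplicas still applies afterwards.
print(replicas_for_backlog(0, target_per_replica=100))    # 0
```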

RabbitMQ queue

Scale based on RabbitMQ queue metrics. This is useful for consumer installations that process messages from a RabbitMQ queue.

Connection Method: Either provide a direct endpoint URL or reference an environment variable (defaults to RABBITMQ_HOST)
Queue Name: The RabbitMQ queue to monitor
Scaling Mode: Scale by Queue Length (pending messages per replica) or Message Rate (messages per second per replica)
Protocol: AMQP or HTTP (the HTTP management API). Message Rate mode requires HTTP
Target Value: The threshold per replica that triggers scaling
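
The options above interact: Message Rate mode is only available over HTTP. The sketch below models a trigger as a plain Python dataclass and checks that constraint. The class, its field names, and the lowercase option values are all hypothetical, chosen for illustration; they do not reflect Ryvn's actual configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RabbitMQTrigger:
    # All field names and values here are illustrative, not Ryvn's schema.
    queue_name: str
    mode: str                # "queue_length" or "message_rate"
    protocol: str            # "amqp" or "http"
    target_value: float      # threshold per replica
    host_env_var: str = "RABBITMQ_HOST"  # default named in the docs above

    def validate(self) -> list[str]:
        """Return a list of human-readable problems; empty means valid."""
        problems = []
        if self.mode not in ("queue_length", "message_rate"):
            problems.append(f"unknown scaling mode: {self.mode}")
        if self.protocol not in ("amqp", "http"):
            problems.append(f"unknown protocol: {self.protocol}")
        if self.mode == "message_rate" and self.protocol != "http":
            problems.append("Message Rate mode requires the HTTP protocol")
        if self.target_value <= 0:
            problems.append("target value must be positive")
        return problems


ok = RabbitMQTrigger("orders", mode="queue_length", protocol="amqp", target_value=50)
bad = RabbitMQTrigger("orders", mode="message_rate", protocol="amqp", target_value=20)
print(ok.validate())   # []
print(bad.validate())  # ['Message Rate mode requires the HTTP protocol']
```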

How autoscaling works

Ryvn continuously monitors the metrics defined by your triggers. As those metrics move above or below their targets, the following process occurs:

Evaluate triggers

Ryvn evaluates each active trigger and calculates the replica count each one requires.

Determine scaling need

The highest replica count across all triggers is used as the desired count.

Scale up

If more replicas are needed, Ryvn immediately provisions new replicas to handle the increased load.

Scale down

If fewer replicas are needed, Ryvn waits for a cool-down period before removing replicas to prevent rapid fluctuations.
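
The four steps above can be sketched as a small simulation. It is a simplified model, not Ryvn's implementation: the Autoscaler class, the step method, and the 300-second COOLDOWN_SECONDS value are assumptions made for illustration, since this page does not state the length of the cool-down period.

```python
from dataclasses import dataclass
from typing import Optional

COOLDOWN_SECONDS = 300.0  # hypothetical value; the docs do not specify it


@dataclass
class Autoscaler:
    min_replicas: int
    max_replicas: int
    replicas: int
    below_since: Optional[float] = None  # when a lower count was first requested

    def step(self, trigger_counts: list[int], now: float) -> int:
        # Steps 1-2: take the highest count any trigger requires, then bound it.
        desired = max(trigger_counts, default=self.min_replicas)
        desired = max(self.min_replicas, min(desired, self.max_replicas))
        if desired > self.replicas:
            # Step 3: scale up immediately.
            self.replicas = desired
            self.below_since = None
        elif desired < self.replicas:
            # Step 4: scale down only after the cool-down has elapsed.
            if self.below_since is None:
                self.below_since = now
            elif now - self.below_since >= COOLDOWN_SECONDS:
                self.replicas = desired
                self.below_since = None
        else:
            self.below_since = None
        return self.replicas


scaler = Autoscaler(min_replicas=2, max_replicas=10, replicas=2)
print(scaler.step([3, 6], now=0))    # 6: scale up right away to the highest demand
print(scaler.step([3, 2], now=60))   # 6: lower demand seen, cool-down starts
print(scaler.step([3, 2], now=200))  # 6: still inside the cool-down
print(scaler.step([3, 2], now=360))  # 3: cool-down elapsed, scale down
```

The asymmetry is deliberate in this model: adding capacity quickly protects against load spikes, while delaying removal avoids flapping when a metric hovers around its target.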

Best practices

When implementing scaling for your applications, consider the following recommendations:

Initial setup

Start with manual scaling to understand your application’s baseline resource needs and traffic patterns. This information will help you set appropriate autoscaling parameters later.

Resource targets

Set your CPU target between 70% and 80% and your memory target between 80% and 85%. These ranges provide good resource utilization while maintaining headroom for traffic spikes.

Queue-based scaling

For Temporal or RabbitMQ triggers, start with a conservative target value and adjust based on your workers’ throughput. Combine a queue trigger with a CPU/memory trigger to handle both queue backlogs and resource-intensive workloads.

High availability

For production servers, configure a minimum of two replicas to maintain high availability. Consider your application’s cold start time when setting the minimum replica count.

Monitoring

Regularly review your scaling patterns and adjust thresholds based on actual usage and application performance metrics.

Monitoring and debugging

You can monitor your server’s scaling behavior through the Ryvn Dashboard, which provides real-time visibility into:

Current replica count and status
Scaling events history
CPU and memory utilization metrics
Queue depth and message rate for queue-based triggers

For advanced scaling scenarios or help with debugging scaling issues, please reach out to our support team.
