Scaling and Fly
Fly Scaling Architecture
Scaling Dimensions
There are three dimensions of scaling on Fly:
- Creating new application instances on demand
- Ensuring the application has instances running in multiple regions
- Increasing the CPU cores and memory size of application instances
Scaling And Configuration
Fly scales within regions by creating more instances of the application as needed. That need is defined by the number of concurrent connections that an application has. The thresholds are defined in the fly.toml file under services.concurrency.
By default, when an application sees 20+ connections, a new instance of the application is started and new connections go to that instance. By adjusting the soft and hard limits of concurrency in the configuration file, you can set how many connections will trigger the creation of a new instance.
Where that instance will appear depends on regional availability and Fly's auto-balancing mechanisms.
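As a rough illustration of the threshold behaviour described above, the following Python sketch models how many instances would be running for a given number of concurrent connections. The function name `instances_needed` and the single `threshold` parameter are assumptions made for this example; Fly's real scheduler works from the soft and hard limits in fly.toml and is not reproduced here.

```python
def instances_needed(connections, threshold=20):
    """Hypothetical model: one extra instance each time the connection
    count reaches another multiple of `threshold` (20 by default)."""
    if connections < 0 or threshold <= 0:
        raise ValueError("connections must be >= 0 and threshold must be > 0")
    # Below the threshold a single instance serves everything;
    # reaching the threshold triggers the next instance.
    return connections // threshold + 1


if __name__ == "__main__":
    for count in (5, 19, 20, 45):
        print(count, "connections ->", instances_needed(count), "instance(s)")
```

Raising the threshold in this model delays the point at which a new instance is created, which is the effect the concurrency limits in the configuration file are meant to give you.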
Regional Scaling
Regional scaling means deploying your application to different datacenters around the world. The region pool is a list of regions where the application is allowed to deploy. When an application is created, the first region in the pool is the region determined to be nearest to the user creating the application. You can confirm this by running flyctl regions list.
flyctl regions list
Region Pool:
lhr
Backup Region:
ams
fra
The create command, in this case, was issued in the UK, so London (LHR, London Heathrow) is the closest region. There is one exception to this rule: when Turboku applications are created, they default to the region nearest to the location of their source Heroku application (iad for US applications and ams for European applications).
Backup Regions
Notice also the list of Backup Regions. If, for any reason, the application can't be deployed in LHR, Fly will attempt to bring it up in either AMS (Amsterdam) or FRA (Frankfurt). Users won't notice this, as they will be directed to a running instance automatically. Backup Regions are selected based on the Region Pool and the geographical closeness of other regions.
Scaling Modes
Regional scaling is based on a pool of regions where the application can be run. Using the selected mode, the system will create at least the minimum number of application instances across those regions. The mode will then be able to create instances up to the maximum count. The min and max are global parameters for the scaling. There are two scaling modes, Standard and Balanced.
Standard: Instances of the application, up to the minimum count, are evenly distributed among the regions in the pool. They are not relocated in response to traffic. New instances are added where there is demand, up to the maximum count.
Balanced: Instances of the application are, at first, evenly distributed among the regions in the pool up to the minimum count. Where traffic is high in a particular region, new instances will be created there and then, when the maximum count of instances has been used, instances will be moved from other regions to that region. This movement of instances is designed to balance supply of compute power with demand for it.
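The initial even distribution that both modes start from can be pictured with a short Python sketch. The function name `distribute_minimum` and the round-robin handling of a remainder (earlier regions in the pool receive the extra instances) are assumptions for illustration only; the source does not say how Fly breaks ties.

```python
def distribute_minimum(regions, min_count):
    """Spread `min_count` instances as evenly as possible over `regions`.

    Hypothetical sketch: instances are handed out round-robin in pool
    order, so any remainder lands on the regions listed first.
    """
    if not regions:
        raise ValueError("the region pool must contain at least one region")
    counts = {region: 0 for region in regions}
    for i in range(min_count):
        counts[regions[i % len(regions)]] += 1
    return counts


if __name__ == "__main__":
    print(distribute_minimum(["ams", "nrt", "sjc"], 5))
```

With a pool of three regions and a minimum count of 5, this sketch places two instances in each of the first two regions and one in the third.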
To determine what the current settings of an application are, run flyctl scale show:
flyctl scale show
Scale Mode: Standard
Min Count: 1
Max Count: 10
VM Size: micro-2x
This scaling plan uses standard, even distribution of instances, with a minimum of 1 instance and up to 10 instances that can be created on demand.
Modifying The Region Pool
To control which regions an application can be deployed to, the flyctl regions command has two more sub-commands: add and remove. Each takes a space-separated list of regions and adds them to, or removes them from, the region pool. The add command also raises the scaling plan's minimum count of instances to the number of regions in the pool, to save having to adjust it. Note that it only adjusts the value upwards, so if you remove regions, you will have to reset the minimum count manually.
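The interaction between the region pool and the minimum count can be modelled in a few lines of Python. The helper names `add_regions` and `remove_regions` are hypothetical and only mirror the behaviour described above; they do not call flyctl.

```python
def add_regions(pool, min_count, new_regions):
    """Return the updated pool and minimum count after adding regions.

    The minimum count is raised to the pool size if it was lower,
    and never lowered (assumed from the description above).
    """
    updated = list(pool)
    for region in new_regions:
        if region not in updated:
            updated.append(region)
    return updated, max(min_count, len(updated))


def remove_regions(pool, min_count, old_regions):
    """Return the updated pool; the minimum count is left untouched."""
    updated = [region for region in pool if region not in old_regions]
    return updated, min_count


if __name__ == "__main__":
    pool, minimum = add_regions(["lhr"], 1, ["ams", "fra"])
    print(pool, minimum)
    pool, minimum = remove_regions(pool, minimum, ["fra"])
    print(pool, minimum)
```

After the removal in this example the pool holds two regions while the minimum count is still 3, which is the situation where you would reset the minimum yourself.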
Modifying The Scaling Plan
As mentioned above, the scaling mode controls how the regions in the pool are used for allocating instances. To set the mode use:
flyctl scale standard
or
flyctl scale balanced
Both of these commands set the scaling mode and can take extra settings that tune the mode, specifically setting the minimum count (min) and maximum count (max) of instances. For example, to set balanced mode with a minimum number of instances of 5, you would give this command:
flyctl scale balanced min=5
Want to set a maximum of 10 too? Then do this:
flyctl scale balanced min=5 max=10
If you just want to set the max or min for the currently selected mode, use the set sub-command:
flyctl scale set min=5 max=10
Viewing The Application's Scaled Status
To view where the instances of a Fly application are currently running, use flyctl status:
flyctl status
App
Name = hellofly
Owner = dj
Version = 299
Status = running
Hostname = hellofly.fly.dev
Deployment Status
ID = 59b60abf-ba4f-fb2f-9f78-35a249e2bef5
Version = v299
Status = successful
Description = Deployment completed successfully
Allocations = 3 desired, 3 placed, 3 healthy, 0 unhealthy
Allocations
ID       VERSION REGION DESIRED STATUS  HEALTH CHECKS CREATED
8a9358d1 299     ams    run     running 1 passing     15m36s ago
7c08ce47 299     nrt    run     running 1 passing     15m36s ago
1b17a5e6 299     sjc    run     running 1 passing     15m36s ago
If a region is listed with (b) following it, the instance is running in a backup region.
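If you want a quick per-region tally from the allocation rows, a small Python sketch like the one below can help. It assumes the rows are whitespace-separated with the region in the third column, as in the listing above, and that a backup marker appears as a `(b)` suffix attached directly to the region code; both are assumptions about the output format, and `instances_per_region` is a hypothetical helper name.

```python
from collections import Counter


def instances_per_region(allocation_rows):
    """Count allocations per region from pasted `flyctl status` rows."""
    counts = Counter()
    for line in allocation_rows.strip().splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and the column header row.
        if len(fields) < 3 or fields[0] == "ID":
            continue
        region = fields[2]
        if region.endswith("(b)"):  # assumed form of the backup marker
            region = region[: -len("(b)")]
        counts[region] += 1
    return counts


if __name__ == "__main__":
    rows = """
    8a9358d1 299 ams run running 1 passing 15m36s ago
    7c08ce47 299 nrt run running 1 passing 15m36s ago
    1b17a5e6 299 sjc run running 1 passing 15m36s ago
    """
    print(dict(instances_per_region(rows)))
```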
Scaling Virtual Machines
Each application instance on Fly runs in its own virtual machine. The number of cores and memory available in the virtual machine can be set for all application instances using the flyctl scale vm command.
Viewing The Current VM Size
Using flyctl scale vm on its own will display the details of the application's current VM sizing.
flyctl scale vm
Size: micro-1x
CPU Cores: 0.12
Memory: 128 MB
Price (Month): $2.670000
Price (Second): $0.000001
It shows the size name (micro-1x), the number of CPU cores, the memory (in GB or MB), the estimated price per month (if an instance is kept running for a whole month), and the price per second (if an instance is only brought up on demand).
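To get a feel for how the per-second price relates to on-demand usage, here is a minimal Python sketch. The function `on_demand_cost` is a hypothetical helper, and because flyctl shows the per-second figure to only six decimal places, any result should be treated as a rough estimate rather than a billing calculation.

```python
def on_demand_cost(price_per_second, seconds, instances=1):
    """Estimate the cost of running `instances` VMs for `seconds` each."""
    if price_per_second < 0 or seconds < 0 or instances < 0:
        raise ValueError("all arguments must be non-negative")
    return price_per_second * seconds * instances


if __name__ == "__main__":
    # micro-1x at the displayed $0.000001 per second,
    # up for eight hours a day over thirty days.
    seconds_up = 8 * 3600 * 30
    print(f"${on_demand_cost(0.000001, seconds_up):.2f}")
```

In this example the instance is up for 864,000 seconds, which comes to roughly $0.86, compared with the $2.67 listed for keeping a micro-1x running all month.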
Viewing Available VM Sizes
The flyctl platform vm-sizes command will display the various sizes with cores and memory and current pricing:
flyctl platform vm-sizes
NAME     CPU CORES MEMORY PRICE (SECOND) PRICE (MONTH)
micro-1x 0.12      128 MB $0.000001      $2.670000
micro-2x 0.25      512 MB $0.000003      $8.000000
cpu1mem1 1         1 GB   $0.000013      $35.000000
cpu2mem2 2         2 GB   $0.000027      $70.000000
cpu4mem4 4         4 GB   $0.000053      $140.000000
cpu8mem8 8         8 GB   $0.000107      $280.000000
Note: This pricing is correct as of writing (March 2020); run flyctl platform vm-sizes to get the most current pricing.
The CPU Cores column shows how many vCPU cores will be allocated to the virtual machine. Below 1, the value reflects the proportion of a shared core that the VM will have available. At 1 or above, it represents the number of cores (from a pool of hyper-threaded cores) that will be available to the VM.
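The two readings of the CPU Cores column can be expressed as a small Python helper. The name `describe_cpu` is hypothetical, and treating a value of exactly 1 as a whole core is an assumption based on the cpu1mem1 row in the table.

```python
def describe_cpu(cpu_cores):
    """Turn a CPU Cores value into a plain-language description."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    if cpu_cores < 1:
        # Fractional values are a share of one shared core.
        return f"{cpu_cores:.0%} of a shared core"
    # Whole values are a count of cores from the hyper-threaded pool.
    return f"{int(cpu_cores)} core(s) from the hyper-threaded pool"


if __name__ == "__main__":
    for value in (0.12, 0.25, 1, 8):
        print(value, "->", describe_cpu(value))
```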
Setting VM Size For An App
To set the size of the VM, add the required size name to flyctl scale vm. For example, to move our application up one size, from micro-1x to micro-2x, we would run:
flyctl scale vm micro-2x
Scaled VM size to micro-2x
CPU Cores: 0.25
Memory: 512 MB
Price (Month): $8.000000
Price (Second): $0.000003
Flyctl responds with the sizes and pricing for a single new instance. All existing instances will be restarted at this new size.