How to Scale to 20 million+ Users on AWS?
The Scaling Ladder: From 1 to 20 Million Users on AWS
Overview
In this article, we will walk through scaling your application infrastructure from a single user to more than 20 million. Whether you are new to cloud computing or looking to understand the nuances of AWS services, this guide should provide valuable insights. While the focus is on AWS, the core principles of scalability, reliability, and performance apply to any cloud provider.
Joel Williams, an Amazon Web Services Solutions Architect, gave an excellent talk on exactly this subject: AWS re:Invent 2015, Scaling Up to Your First 10 Million Users. This guide is inspired by the foundational concepts discussed in that talk.
Let's start with the basics of AWS.
Basics of AWS
AWS operates on a global infrastructure model with three key layers:
- Regions → Availability Zones → Edge Locations
Regions
- Regions are geographically isolated areas containing multiple data centers.
- The purpose is to provide data sovereignty, latency reduction, and disaster recovery.
- Each region operates independently with its own power grid, networking, and cooling systems.
- As of 2025, AWS spans 37 geographic Regions.
Availability Zones (AZs)
- AZs are physically separated data centers within a Region (typically 3–6 per Region). An AZ is usually a single data center, but it can consist of more than one.
- AZs provide isolated failure domains (separate power, networking, facilities).
- As of 2025, there are 117 Availability Zones in total.
- Each AZ is isolated from the others, with its own power, internet connectivity, and other resources.
- The only connection between AZs is a low-latency network. AZs may be anywhere from 5 to 100 miles apart, but the network between them is fast enough that they behave almost as if they were in the same data center.
- Each Region has at least two AZs, and many have more.
- We'll see how using AZs it's possible to create a high availability architecture for your application.
- Another feature of AZs is synchronous data replication between them, which is highly useful and which we will leverage later on.
Edge Locations
- AWS has hundreds of edge locations around the world.
- Edge locations are used by CloudFront, Amazon's Content Delivery Network (CDN), and by Route 53, Amazon's managed DNS service.
- Edge locations let users access content with very low latency, no matter where they are in the world.
Building Block Services
Multi-AZ Deployment:
Region: us-east-1
├── AZ-1a: Primary DB, Load Balancer
├── AZ-1b: Standby DB, App Servers
└── AZ-1c: App Servers, Cache Layer
AWS provides a number of services that use multiple AZs internally to be highly available and fault tolerant.
This way, we can use them directly without having to manage their availability and robustness ourselves.
A highly scalable architecture can be created from these services even when the resources you manage yourself live within a single AZ.
From a Single User to Your First Thousand
1 User: A Humble Beginning
In this scenario you are the only user and you want the website to keep running. Your architecture will look something like this:
- Run on a single instance, maybe a t2.micro. Instance types comprise varying combinations of CPU, memory, storage, and networking capacity, giving you the flexibility to choose the appropriate mix of resources for your application.
- That one instance runs the entire application: the web application, the database, and so on.
- To expose the application to end users we can use Amazon Route 53 as DNS.
The First Scaling Challenge: Vertical Scaling
A time will come when you need a bigger box. The simplest approach to scaling is to choose a larger instance type, maybe a c4.8xlarge or an m3.2xlarge. This approach is called vertical scaling.
How it works:
- You simply increase the size and power of your EC2 instance.
- This involves stopping the instance, selecting a more powerful instance type (e.g., from a t2.micro to an m5.large), and restarting it.
- There is a wide mix of hardware configurations to choose from. You can have a system with 244 GB of RAM, or one with 40 cores. There are high-I/O instances, high-CPU instances, and high-storage instances.
- Some Amazon services offer a Provisioned IOPS option to guarantee performance. The idea is that you can use a smaller instance type for your own service and lean on Amazon services like DynamoDB that deliver scalability for you, so you don't have to build it yourself.
But vertical scaling has a big problem:
- There's no failover and no redundancy. If the instance has a problem, your website dies. All your eggs are in one basket.
- Eventually a single instance can only get so big. You need to do something else.
Growing Pains: Decoupling the Database (10+ Users)
Now it's time to separate out a single host into multiple hosts:
- One host for the website and another host for the database.
- Run any database you want, but you are on the hook for the database administration.
- Using separate hosts allows the web site and the database to be scaled independently of each other.
- Perhaps your database will need a bigger machine than your web site, for example.
- Or instead of running your own database you could use a database service.
👉 If you are not a database expert, or you would rather not worry about backups and availability yourself, you can use a database service instead.
A big advantage of using a service is you can have a multi-AZ database setup with a single click. You won't have to worry about replication or any of that sort of thing. Your database will be highly available and reliable.
Amazon has several fully managed database services, and Amazon RDS (Relational Database Service) is the most popular one.
- Start with a SQL database instead of a NoSQL database.
- The suggestion is to start with a SQL database because the technology is established. There's lots of existing code, communities, support groups, books, and tools. You aren't going to break a SQL database with your first 10 million users. Not even close. (unless your data is huge).
- With SQL there are clear patterns to scalability and it is extremely good when you have relational data.
When might you need to start with a NoSQL database?
- If you need to store > 5 TB of data in year one.
- If you have an incredibly data intensive workload.
- If your application has super low-latency requirements and really high throughput (e.g., a chat application might use Cassandra).
- If you need to really tweak the IOs on reads/writes.
- Another use case is if you strictly don't have any relational data.
Users > 100
- Use a separate host for the web tier.
- Move the database to Amazon RDS, which handles administration tasks such as backups and patching for you.
- Moving to RDS at this stage lets us spend our energy rolling out new updates to the application without worrying much about the database.
Users > 1000
Currently your application has availability issues. If the host for your web service fails, your website goes down, which makes a very bad first impression.
- A single instance for your web tier is a single point of failure.
- You need another web instance in another Availability Zone.
- Latency between AZs is in the low single digit milliseconds, almost like they are right next to each other.
Multi-AZ Deployment:
- Deploy at least two web server instances in separate AZs.
Database Tier:
- Configure RDS for a Multi-AZ deployment.
- With a single click, RDS will provision and maintain a synchronous standby replica in a different AZ.
- In case of a primary database failure, RDS automatically fails over to the standby replica.
- Your application continues to function using the same database endpoint.
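As a sketch of why the stable endpoint matters: during a failover, connections briefly fail while DNS flips the endpoint to the promoted standby, so clients should retry with backoff. Below is a minimal, hypothetical illustration in Python; `fake_connect` simulates the outage window, and in real code `connect` would be a database driver's connect call bound to the RDS endpoint.

```python
import time

def connect_with_retry(connect, retries=5, base_delay=0.5):
    """Retry a DB connection while a Multi-AZ failover completes.

    `connect` is any callable that returns a connection or raises
    on failure (e.g. a driver's connect bound to the RDS endpoint).
    """
    for attempt in range(retries):
        try:
            return connect()
        except ConnectionError:
            if attempt == retries - 1:
                raise
            # Exponential backoff while DNS flips to the promoted standby.
            time.sleep(base_delay * 2 ** attempt)

# Simulate a failover window: the first two attempts fail, then the
# standby has been promoted and the same endpoint works again.
attempts = {"n": 0}
def fake_connect():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("primary unreachable during failover")
    return "connected"

print(connect_with_retry(fake_connect, base_delay=0.01))  # connected
```

Because the endpoint name never changes, this retry loop is all the application needs; no configuration change is required after a failover.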
Elastic Load Balancer (ELB)
- ELB is a highly available managed load balancer.
- The ELB spans multiple AZs.
- It's a single DNS endpoint for your application. Just put it in Route 53 and it will load balance across your web host instances.
- The ELB has Health Checks that make sure traffic doesn't flow to failed hosts.
- It scales without you doing anything. If it sees additional traffic, it scales behind the scenes, both horizontally and vertically.
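The health-check behavior can be sketched in a few lines. This is not how ELB is implemented internally, just a toy model of the routing rule: round-robin across registered targets, skipping any instance that fails its health check (the instance names here are made up).

```python
import itertools

class LoadBalancer:
    """Toy model of ELB routing: round-robin across targets,
    skipping any instance whose health check fails."""

    def __init__(self, targets, health_check):
        self.targets = targets
        self.health_check = health_check  # callable: target -> bool
        self._cycle = itertools.cycle(targets)

    def route(self):
        # Try each target at most once per request.
        for _ in range(len(self.targets)):
            target = next(self._cycle)
            if self.health_check(target):
                return target
        raise RuntimeError("no healthy targets")

# Two web instances in different AZs; the one in us-east-1a is down.
healthy = {"web-1a": False, "web-1b": True}
lb = LoadBalancer(["web-1a", "web-1b"], lambda t: healthy[t])
print(lb.route())  # web-1b
print(lb.route())  # web-1b -- traffic never reaches the failed host
```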
Offloading for Performance — Users > 10,000s – 100,000s
As traffic increases, you need to optimize performance by reducing the load on your web servers and database.
- Currently you have at least two instances behind the ELB; in practice, you can have thousands of instances behind it (horizontal scaling).
- Add read replicas to the database in RDS to take load off the write master.
- Move static content in your web app to Amazon S3 and Amazon CloudFront.
- S3 is an object store, not a block store — great for static content (JS, CSS, images, videos).
- CloudFront caches content at edge locations (hundreds globally).
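Read replicas only help if the application actually sends reads to them. A common pattern is read/write splitting at the data-access layer. The sketch below uses hypothetical endpoint names and a deliberately crude SELECT check:

```python
import random

class RoutingDB:
    """Sketch of read/write splitting: writes go to the master,
    reads are spread across replicas to offload it."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas

    def endpoint_for(self, sql):
        # Crude heuristic: anything starting with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.replicas:
            return random.choice(self.replicas)
        return self.master

db = RoutingDB("master.db.example.com",
               ["replica-1.db.example.com", "replica-2.db.example.com"])
print(db.endpoint_for("SELECT * FROM users"))    # one of the replicas
print(db.endpoint_for("INSERT INTO users ..."))  # master.db.example.com
```

One caveat: replicas lag slightly behind the master, so read-your-own-writes queries (e.g. showing a user the profile they just saved) should still go to the master.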
Other techniques:
- Shift session state off your web tier (to ElastiCache or DynamoDB).
- Cache data from your DB into ElastiCache to reduce load.
Amazon ElastiCache
- Managed Memcached or Redis.
- Self-healing infrastructure, auto scaling.
- Crucial for 10–100k users.
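The usual way to use ElastiCache is the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache for subsequent reads. A minimal sketch, with a plain dict standing in for Redis and a counter standing in for the database:

```python
cache = {}  # stands in for ElastiCache (Redis/Memcached)

db_reads = {"count": 0}
def query_db(user_id):
    # Stand-in for a real (expensive) database query.
    db_reads["count"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    """Cache-aside: try the cache, fall back to the DB on a miss,
    then populate the cache for subsequent reads."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]
    row = query_db(user_id)
    cache[key] = row  # real code would also set a TTL
    return row

get_user(42); get_user(42); get_user(42)
print(db_reads["count"])  # 1 -- two of the three reads never hit the DB
```

In production you would also set a TTL on each key so stale entries expire, and decide how to invalidate the cache on writes.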
Automating Scalability: Auto Scaling
If you provision enough capacity to always handle your peak traffic load, you are wasting money.
- Auto Scaling lets you match compute power with demand.
- Define min and max pool size.
- CloudWatch metrics drive scaling (CPU, latency, network traffic, custom metrics).
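One common policy, target tracking, can be approximated with a few lines of arithmetic: size the fleet so average CPU lands near a target, clamped to the min/max pool size. This is a simplified sketch; real CloudWatch-driven scaling also applies cooldowns and smoothing.

```python
import math

def desired_capacity(current, cpu_pct, target_pct=50.0,
                     min_size=2, max_size=20):
    """Size the fleet so average CPU lands near target_pct,
    clamped to the configured min/max pool size."""
    desired = math.ceil(current * cpu_pct / target_pct)
    return max(min_size, min(max_size, desired))

print(desired_capacity(4, cpu_pct=90))  # 8 -- scale out under load
print(desired_capacity(8, cpu_pct=20))  # 4 -- scale in when idle
print(desired_capacity(2, cpu_pct=5))   # 2 -- never below the minimum
```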
A Robust Architecture for Users > 500,000+
At this stage:
- Architecture should be highly available, scalable, and observable.
- Auto scaling groups on the web tier (across 2–3 AZs, up to thousands of instances).
- ElastiCache and DynamoDB are critical for offloading DB.
- Add monitoring, metrics and logging (CloudWatch, CloudTrail).
- End-User Experience Monitoring (New Relic, Pingdom).
Embracing Automation and Decoupling
Manual management becomes impractical.
Infrastructure as Code (IaC)
- AWS CloudFormation (JSON/YAML templates).
- AWS Elastic Beanstalk (PaaS).
- AWS OpsWorks (Chef/Puppet).
- AWS CodeDeploy (fleet deployments, integrates with Auto Scaling, Chef, Puppet).
Decoupling with Microservices and Queues
- Break monolith into SOA/Microservices.
- Use Amazon SQS for message queues.
- Use AWS Lambda for event-driven serverless compute (S3 triggers, queue triggers).
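The decoupling idea in miniature: the web tier enqueues work and responds to the user immediately, while a separate worker tier (or a Lambda function triggered by the queue) drains messages at its own pace. Here Python's `queue.Queue` stands in for SQS, and the thumbnail task is a made-up example:

```python
from queue import Queue

jobs = Queue()  # stands in for an SQS queue

def enqueue_thumbnail_job(image_key):
    # Web tier: respond to the user right away, defer the heavy work.
    jobs.put({"task": "thumbnail", "key": image_key})

def worker_drain(results):
    # Worker tier: processes messages independently of the web
    # tier's traffic spikes; the queue absorbs the burst.
    while not jobs.empty():
        msg = jobs.get()
        results.append(f"thumbnailed {msg['key']}")

enqueue_thumbnail_job("uploads/cat.jpg")
enqueue_thumbnail_job("uploads/dog.jpg")
done = []
worker_drain(done)
print(done)  # ['thumbnailed uploads/cat.jpg', 'thumbnailed uploads/dog.jpg']
```

The key property is that the two tiers scale and fail independently: a slow worker never blocks the web tier, and a traffic spike just makes the queue temporarily deeper.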
Don't Reinvent the Wheel
Only invest in tasks that differentiate you as a business.
- Many AWS services are inherently fault tolerant (queuing, email, transcoding, search, databases, monitoring, metrics, logging, compute).
- Don't build them yourself unless necessary.
Users > 1,000,000+
Reaching a million users requires:
- Multi-AZ
- Elastic Load Balancing (all tiers)
- Auto Scaling
- SOA
- Serve content smartly with S3 + CloudFront
- Cache in front of DB
- Move state off web tier
- Use Amazon SES for email
- Use CloudWatch for monitoring
Conquering the Database Bottleneck — Users > 10,000,000+
At this scale, the primary bottleneck is the write capacity of the master database.
Strategies:
Federation
- Split DB by function (Forums DB, User DB, Products DB).
- Scales independently, but no cross-DB queries.
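Federation in miniature is just routing by functional area; the database hostnames below are made up:

```python
# Each functional area lives in its own database, so each can be
# scaled (or moved to bigger hardware) independently.
FEDERATED_DBS = {
    "forums":   "forums-db.example.internal",
    "users":    "users-db.example.internal",
    "products": "products-db.example.internal",
}

def db_for(area):
    return FEDERATED_DBS[area]

print(db_for("users"))  # users-db.example.internal
# The trade-off: a query joining users and products now needs two
# round trips and an application-level join.
```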
Sharding
- Split dataset across multiple hosts.
- Application-layer complexity.
- No practical scalability limit.
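A minimal hash-based sharding sketch: hash the shard key with a stable hash so the same user always maps to the same host (the shard names are hypothetical):

```python
import hashlib

SHARDS = ["users-shard-0", "users-shard-1",
          "users-shard-2", "users-shard-3"]

def shard_for(user_id):
    """Map a shard key to a host. A stable hash (not Python's
    randomized built-in hash()) keeps the mapping consistent
    across processes and restarts."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Every lookup for the same user lands on the same host.
assert shard_for(1234) == shard_for(1234)
print(shard_for(1234))
```

Note that with plain modulo hashing, adding a shard remaps most keys; schemes like consistent hashing exist precisely to limit that churn, which is part of the application-layer complexity mentioned above.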
Other DB Types
- Move workloads to NoSQL, Graph, etc.
- Great for clickstream/logs, leaderboards, hot tables, metadata/lookup tables.
The Road to 20 Million Users and Beyond
Scaling is continuous.
- Fine tune application.
- More SOA features.
- From Multi-AZ → Multi-region.
- Build custom solutions for unique problems.
- Deep analysis of full stack.
In Summary: Key Principles for Scalable Architectures
- Use a multi-AZ infrastructure for reliability.
- Make use of self-scaling services (ELB, S3, SQS, SNS, DynamoDB).
- Build redundancy at every level.
- Start with a traditional relational SQL database.
- Cache data both inside and outside infra.
- Use IaC + automation tools.
- Have good metrics/monitoring/logging.
- Split tiers into SOA for independent scaling/failure.
- Use Auto Scaling when ready.
- Don't reinvent the wheel — use managed services.
- Move to NoSQL if and when it makes sense.
We have now discussed steps for scaling web applications to millions of users all using AWS. By following these principles and progressively evolving your architecture, you can build a robust and scalable system capable of supporting millions of users.
Hope you liked it and had something to learn. Thanks!