Building A Large Scalable System

Most applications are built on a three-tier architecture consisting of a presentation layer, a business logic layer, and a data access layer. The presentation layer contains ASPX, HTML, or JSP pages; the business layer contains services such as WCF, Web API, or web services; and the data layer contains the code that communicates with the databases where the actual data resides.

This is how the application architecture looks.


How scalability comes into the picture

The web application is up and running, users are happy, and the business is earning revenue. Everything goes fine while the business is small. Gradually the number of users grows, the traffic to the application becomes huge, and the application becomes very slow. When an application is very slow, users lose interest in it, and the business loses revenue and reputation, and losing the business is losing everything.

This is where scalability comes into the picture: how to extend the system to serve a significantly higher volume of users. Scalability is not the same as performance, and it is not a code issue; it is about how to extend the application across multiple servers, multiple databases, and multiple locations to serve millions of users.

When designing any system there are some key considerations which developers and architects should keep in mind. These are: 

  • Scalability – The number of users, sessions, transactions, and operations the system can support.

  • Performance – The system should make optimal use of resources such as CPU, threads, and memory.

  • Responsiveness – The time taken per operation should be low; a user should not wait long to get information from the server. For example, if we are booking tickets and each transaction is very slow, we conclude that it is a bad application.

  • Availability – The system should be available at any given point in time. If not fully, it should be at least partially available, so that end users still perceive the system as available.

  • Downtime Impact – The impact of a server, service, or resource going down, measured by the number of users affected and the type of impact, should be minimal.

  • Cost – The cost of the system should stay within budget. A more expensive system does not by itself bring more profit to the organization.

  • Maintenance Effort – The system should need very little maintenance effort. For example, once the system is developed, it should be easy to extend or enhance its features.

The key techniques to consider when designing scalable systems are:

  • Vertical scaling
  • Horizontal Scaling
  • Horizontal Partitioning
  • Vertical Partitioning
  • Load balancing
  • Master-Slave setup
  • Distributed Caching
  • Use NoSQL
  • Incremental model development 

Vertical scaling

Vertical scaling means adding hardware such as RAM, CPUs, or faster processors to the existing machine to increase the processing capacity of the server. In a virtual machine setup, the extra resources can be configured virtually instead of adding real physical hardware. Increasing the hardware resources does not change the number of nodes. This is referred to as “scaling up” the server.

The advantage is that it is simple to implement. The disadvantage is that there is a finite limit to how much hardware we can add to one machine. Hardware also does not scale linearly (there are diminishing returns for each incremental unit), and adding hardware usually requires downtime.

Horizontal Scaling

Horizontal scaling means adding more web servers, similar to the existing one, behind a load balancer. Multiple machines now work together to give quick responses and high availability for any part of the system, including the database, and the workload is distributed among them.

Each machine works as a separate node, the nodes are identical in nature, and a group of nodes is sometimes treated as a cluster of servers. This is referred to as “scaling out” the web server.

The advantage is that multiple servers share the user traffic. The cost is that code deployments, session management, and cached data must now be kept in sync across the servers so that the user sees consistent behavior.

Horizontal Partitioning

Horizontal partitioning partitions or segments rows into multiple tables with the same columns.

For example, customers whose city is ABC are stored in the table Customers ABC, while customers whose city is XYZ are stored in the table Customers XYZ. Here the two partition tables are ABC and XYZ.

Building database partitioning by value into your design from the beginning is a good approach.
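The routing described above can be sketched in a few lines. This is only a minimal illustration, assuming an in-memory SQLite database and the hypothetical table names Customers_ABC and Customers_XYZ; a real system would more likely place each partition on its own server.

```python
import sqlite3

# Hypothetical mapping from city code to partition table (names are illustrative).
PARTITIONS = {"ABC": "Customers_ABC", "XYZ": "Customers_XYZ"}


def table_for_city(city):
    """Return the partition table that holds customers of the given city."""
    if city not in PARTITIONS:
        raise ValueError(f"no partition configured for city {city!r}")
    return PARTITIONS[city]


def create_partitions(conn):
    # Every partition has the same columns; only the rows differ.
    for table in PARTITIONS.values():
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
        )


def insert_customer(conn, name, city):
    # The table name comes from our own whitelist, never from user input.
    table = table_for_city(city)
    conn.execute(f"INSERT INTO {table} (name, city) VALUES (?, ?)", (name, city))


def customers_in_city(conn, city):
    # Only the partition for this city is read; other partitions are untouched.
    table = table_for_city(city)
    rows = conn.execute(f"SELECT name FROM {table} ORDER BY id").fetchall()
    return [row[0] for row in rows]
```

A query for one city touches only that city's table, which is what keeps each partition small as the total number of customers grows.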

Vertical Partitioning

Vertical partitioning means increasing the number of nodes by distributing tasks or functions among them. Each node (or cluster) performs a task different from the others. Vertical partitioning can be performed at various layers: application, server, data, or hardware. This task-based specialization reduces context switching and lets us optimize and tune each node for its task. Instead of putting everything into one box, we put it into different boxes. In a database, for example, if the customer, orders, customer order, and order status tables are all in one DB, we can move some of them into another DB.
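As a small sketch of the database example, the application can keep a lookup from table to database so that callers do not need to know where a table lives. The database names customer_db and order_db and the particular split are assumptions made for illustration only.

```python
# Hypothetical assignment of tables to databases after vertical partitioning.
TABLE_TO_DATABASE = {
    "customer": "customer_db",
    "orders": "order_db",
    "customer_order": "order_db",
    "order_status": "order_db",
}


def database_for(table):
    """Return the name of the database that owns the given table."""
    if table not in TABLE_TO_DATABASE:
        raise KeyError(f"table {table!r} is not assigned to any database")
    return TABLE_TO_DATABASE[table]


def tables_in(database):
    """List the tables that live in one database, in alphabetical order."""
    return sorted(t for t, db in TABLE_TO_DATABASE.items() if db == database)
```

Moving a table to another box then becomes a change to this mapping (plus the data migration itself) rather than a change to every caller.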

Load balancing

Load balancing is the process of directing each user request to one of the servers in a server farm. In this way the user load is distributed among several servers.


There are two kinds of load balancers: hardware load balancers and software load balancers. Hardware load balancers are generally faster, whereas software load balancers are more customizable.
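To make the idea concrete, here is a minimal sketch of one common distribution strategy, round robin, with a simple way to skip servers that are down. The class name and server names are hypothetical, and real load balancers offer many more strategies and health checks than this.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in turn, skipping any that are marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def next_server(self):
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server not in self.down:
                return server
        raise RuntimeError("no healthy servers available")
```

Each call to next_server picks the server for one incoming request, so consecutive requests land on different machines.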

There are some things we should keep in mind when coding for a system that has load balancing.

Do not write code that depends on cache or session data stored in the memory or on the local file system of one server; do not rely on the local file system at all.

Let us assume we have several servers and a request from user1 is served by server1. If server1 goes down, the load balancer will redirect user1's next request to server2, but the session data held on server1 is not available on server2. Sticky sessions, where the load balancer keeps sending a user to the same server, do not solve this issue, because the session is still lost when that server fails.

For proper session management we should use a centralized session store that all servers can read. Many large .NET-based web applications use the SQL Server session state mode for this.
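The following sketch shows why a central store removes the failover problem. It is an illustration only: SessionStore is a hypothetical in-memory stand-in for a real central store such as a database table, and WebServer is a hypothetical request handler.

```python
import copy


class SessionStore:
    """Stand-in for a central session store shared by all web servers."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        # Return a copy so servers never share in-memory state by accident.
        return copy.deepcopy(self._data.get(session_id, {}))

    def save(self, session_id, values):
        self._data[session_id] = copy.deepcopy(values)


class WebServer:
    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared store, not local to this server

    def add_to_cart(self, session_id, item):
        # Read the session, change it, and write it back to the central store.
        session = self.store.load(session_id)
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)
        return session["cart"]
```

If server1 handles the first request and server2 handles the second, server2 still sees the cart, because neither server keeps the session in its own memory.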

Scaling from a single DB server to a Master-Slave setup

Isolating databases by purpose gives better performance in scalable systems.

As an example, we earlier had one database that served various SSRS reports and Crystal Reports and was also used by SQL jobs, Windows services, email messaging, and all transaction data. We moved to a master-slave setup for better results. All writes are sent to a single master, which replicates the data to multiple slave nodes. Most major RDBMSs, including MySQL, MSSQL, and Oracle, support native replication.
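On the application side, a master-slave setup usually needs some rule for deciding which node receives each statement. The sketch below is one simple rule under stated assumptions: statements that start with SELECT go to the slaves in turn and everything else goes to the master. The class and node names are hypothetical, and the replication itself is left to the database.

```python
import itertools


class ReplicatedDatabase:
    """Chooses a node for each SQL statement: reads to slaves, writes to master."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._next_slave = itertools.cycle(self.slaves)

    def node_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            # Spread read traffic (for example, reports) across the slaves.
            return next(self._next_slave)
        # Writes, and reads when no slave exists, go to the master.
        return self.master
```

One design point to keep in mind: when replication is asynchronous, a slave can lag slightly behind the master, so a read that must see a just-written row may need to go to the master instead.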


Distributed Caching

Effective caching is key to performance in any distributed system. To make a system highly scalable, the cache should be distributed and may span multiple servers. The cached data may grow over time, so there should be an effective way to handle that growth.

NCache, Velocity, and AppFabric are some good distributed caching options for a large-scale .NET application. The cached data is stored across a cluster of nodes, and these tools can replicate and locate the data for faster access.
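To show how data can be located in a cluster of cache nodes, here is a minimal sketch that hashes each key to pick a node and loads missing values on demand. It does not reflect how NCache or AppFabric work internally; CacheCluster, get_or_load, and the node names are hypothetical, and each node is just a dictionary.

```python
import hashlib


class CacheCluster:
    """Spreads cache entries over several nodes by hashing the key."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self._names = sorted(self.nodes)

    def node_for(self, key):
        # A stable hash makes every server pick the same node for a key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def get(self, key):
        return self.nodes[self.node_for(key)].get(key)

    def set(self, key, value):
        self.nodes[self.node_for(key)][key] = value


def get_or_load(cluster, key, loader):
    """Return the cached value, calling loader(key) only on a cache miss."""
    value = cluster.get(key)
    if value is None:
        value = loader(key)
        cluster.set(key, value)
    return value
```

With this simple modulo placement, adding or removing a node moves most keys to a different node; consistent hashing is a common way to reduce that reshuffling.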


Use NoSQL

NoSQL databases offer advantages in scalability and availability, and many can be scaled or upgraded with little or no downtime. They store the data across multiple nodes, where it can be easily accessed and replicated as needed.

Some NoSQL databases are Cassandra, MongoDB, and CouchDB.

Incremental model development

Inspect the issues, change as needed, and adapt: this is the key to building a scalable system.

Automate builds using Jenkins, TeamCity, or TFS, and aim for 100% test coverage. Changes to the database, code, and configuration need proper testing in the test or UAT environment before the code is pushed to production. From day one of development, designers, architects, and developers should think in terms of loosely coupled modules. Choosing a proper platform and language is also a big factor in building large systems.

Because applications need to scale in distributed environments with many servers, these incremental development steps help to a large extent.

We cannot build a scalable system in a single day; as the saying goes, “Rome was not built in a day.” It takes collaboration and great teamwork among developers, architects, QA, infrastructure, and DevOps to build a highly scalable system.

Hope you got some information about how to scale a large application. Thanks for reading.
