System Design Interview: Chapter 01

Disclaimer

The contents here are notes taken while reading the book "System Design Interview: An Insider’s Guide". Only the important concepts that may be needed for future reference are mentioned here. Please refer to the official book at System Design Interview Book for more detail.

Chapter - 01: From Zero to Million Users

Single User Request

A user accesses a web server through a browser or an app on a mobile device. The following things happen when the user accesses the web server.

  1. The user inputs a domain name. The domain name goes to the DNS, which returns an IP address for the web server.
  2. Once the IP is obtained, an HTTP request is made from the user to the web server.
  3. The web server responds with an HTML page or JSON for the user to render.
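The steps above can be sketched as a toy simulation. The domain name, IP address, and response content here are illustrative stand-ins, assuming a fixed DNS table and a server that returns HTML:

```python
# Hypothetical DNS mapping; a real lookup would go to a DNS resolver.
DNS_TABLE = {"www.mysite.com": "12.34.56.78"}

def dns_lookup(domain: str) -> str:
    """Step 1: the domain name is resolved to an IP address."""
    return DNS_TABLE[domain]

def http_get(ip: str, path: str = "/") -> str:
    """Steps 2-3: an HTTP request goes to that IP; the server
    answers with HTML (or JSON) for the client to render."""
    return f"<html>served by {ip} for {path}</html>"

ip = dns_lookup("www.mysite.com")
page = http_get(ip)
```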

Vertical Scaling & Horizontal Scaling

Load Balancers

When the volume of users and the number of requests increase, a single server may be overloaded. A load balancer distributes the load across multiple servers, mitigating the problem users face when a single server is overloaded while other servers are available.

A load balancer sits between the user's application or web browser and the web servers, so requests first go to the load balancer. The goal of the load balancer is to evenly distribute incoming traffic among the web servers defined in a load-balanced set.

One design practice is that only the load balancer is reachable via a public IP address, while the web servers are not. The load balancer forwards requests to the web servers via their private IPs. Since the public IPs of the web servers are not exposed, this is better from a security standpoint.

If a web server goes offline, the load balancer can automatically forward all subsequent requests to the remaining web servers.
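A minimal sketch of this behavior, assuming a round-robin strategy over the load-balanced set (the server addresses and the health-tracking scheme are illustrative, not from the book):

```python
from itertools import cycle

class LoadBalancer:
    def __init__(self, servers):
        self.servers = list(servers)      # the load-balanced set (private IPs)
        self.healthy = set(self.servers)  # servers currently online
        self._rr = cycle(self.servers)    # round-robin iterator

    def mark_offline(self, server):
        """Called when a health check detects a server is down."""
        self.healthy.discard(server)

    def route(self):
        """Return the next healthy server, skipping offline ones."""
        for _ in range(len(self.servers)):
            server = next(self._rr)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")

lb = LoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
lb.route()                   # traffic is spread evenly across the set
lb.mark_offline("10.0.0.2")  # if a server goes offline...
lb.route()                   # ...requests go only to the remaining servers
```

Real load balancers also support other strategies (least connections, consistent hashing), but round-robin shows the even-distribution goal most directly.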

Database and its replication

It's better to separate the web server and database infrastructure. This allows separate and independent scaling by splitting the system into a Web Tier and a Data Tier.

Database replication is usually done with a master-slave relationship, where the single master holds the original data and multiple slaves hold copies.

The master generally supports only write operations. All data-modifying operations such as insert, delete, and update must be sent to the master.

Slaves handle read operations. Since the ratio of read to write operations is usually very high, there are multiple slave copies.
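The read/write routing can be sketched as follows. The connection names are stand-ins; a real setup would use actual database driver connections:

```python
import random

class ReplicatedDatabase:
    # Data-modifying operations that must go to the master.
    WRITE_OPS = {"INSERT", "UPDATE", "DELETE"}

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves   # multiple replicas for the read-heavy load

    def route(self, statement: str):
        """Send writes to the single master, reads to a random slave."""
        op = statement.strip().split()[0].upper()
        if op in self.WRITE_OPS:
            return self.master
        return random.choice(self.slaves)

db = ReplicatedDatabase("master-db", ["slave-1", "slave-2", "slave-3"])
db.route("INSERT INTO users VALUES (1)")   # routed to "master-db"
db.route("SELECT * FROM users")            # routed to one of the slaves
```

Spreading reads across several slaves is what lets the data tier keep up when reads vastly outnumber writes.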

Caching

To serve frequently accessed data more quickly, a cache tier is added to the system. A cache provides temporary storage for frequently accessed data so that subsequent requests are served faster, which also improves performance and reduces load on the database and web servers.

The cache tier holds temporary data and is much faster than the database. Having a separate cache tier also allows it to be scaled independently.

The Cache Tier sits between the Web Tier and the Database Tier. When a request is received by the web server, it first checks the cache for the data. If the data is available, it is sent back to the client from the cache. If not, the database is queried; the result is then stored in the cache and sent to the client. This is the simplest read-through caching strategy. Other caching techniques are also available.
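A minimal sketch of that flow, using a plain dict as the cache and a stand-in function for the real database call (keys and data here are illustrative):

```python
cache = {}                                  # the cache tier (e.g. Redis/Memcached in practice)
database = {"user:1": {"name": "Alice"}}    # stand-in for the database tier

def query_database(key):
    return database.get(key)

def get(key):
    """Serve from cache if present; otherwise query the database,
    store the result in the cache, and return it."""
    if key in cache:               # cache hit: fast path
        return cache[key]
    value = query_database(key)    # cache miss: go to the database
    cache[key] = value             # populate the cache for next time
    return value

get("user:1")   # first call: miss, fetched from the database, then cached
get("user:1")   # second call: hit, served directly from the cache
```

In production, cached entries would also carry an expiration time (TTL) so stale data is eventually refreshed, a detail omitted here for brevity.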

Content Delivery Network (CDNs)

Stateless Architecture

Message Queue

Database Scaling

Conclusion

It's an iterative process to handle a million or more users. The following items must be taken into account: