Cheatsheet System Design

This cheat sheet provides a quick reference for key system design concepts, including use cases, design questions, and potential solutions. It covers various components like API gateways, load balancers, databases, and caching systems, along with their benefits and caveats. The document serves as a guide for preparing for system design interviews by summarizing essential tools and strategies for different scenarios.

Uploaded by

sumandeep82188
Copyright
© All Rights Reserved

SYSTEM DESIGN CHEAT SHEET

This quick reference sheet I made covers the most important system design concepts, making it easy to review key points
right before your interview.

Fields for each component: use cases/problems, system design questions, what it solves, caveats/issues, mitigations, examples of tools.

API Gateway
- Use cases/problems: Unified API access: Centralizes client requests. Security: Manages authentication and authorization.
- Design questions: Design an API gateway for microservices. Implement secure and scalable API access.
- What it solves: Single entry point, manages authentication and routing.
- Caveats/issues: Can become a bottleneck, adds latency.
- Mitigations: Use multiple gateways with load balancing. Implement rate limiting and caching. Use circuit breakers and retries.
- Example tools: Kong, Apigee, AWS API Gateway

Load Balancer (across multiple redundant workers)
- Use cases/problems: High traffic websites: Ensures uptime and balances load. Scalable APIs: Distributes incoming requests.
- Design questions: Design a scalable web application. Build a highly available online service.
- What it solves: Distributes traffic across workers, improves reliability and availability.
- Caveats/issues: Single point of failure, adds complexity.
- Mitigations: Use multiple load balancers in different regions. Implement health checks. Use DNS-based load balancing.
- Example tools: Nginx, HAProxy, AWS ELB

SQL Database
- Use cases/problems: Financial transactions: Requires ACID compliance. Complex queries: Needs structured and relational data.
- Design questions: Design a financial transaction system. Create a scalable relational database.
- What it solves: Strong ACID properties, structured data, complex queries.
- Caveats/issues: Limited scalability, schema management.
- Mitigations: Implement sharding. Use read replicas. Employ clustering and partitioning.
- Example tools: MySQL, PostgreSQL, MS SQL Server

NoSQL Database
- Use cases/problems: Large-scale data: Supports horizontal scaling. Unstructured data: Flexible schema adapts to changes.
- Design questions: Design a large-scale user profile store. Create a scalable data storage solution.
- What it solves: Flexible schema, horizontal scalability, high performance.
- Caveats/issues: Eventual consistency, limited transaction support.
- Mitigations: Use consistency settings (e.g., quorum reads/writes). Design for idempotent operations. Implement conflict resolution strategies.
- Example tools: MongoDB, Cassandra, DynamoDB

Data Replication
- Use cases/problems: High availability: Ensures data is replicated and available.
- Design questions: Design a data replication strategy. Implement a highly available database system.
- What it solves: Ensures data durability, to ensure system availability.
- Caveats/issues: Increases costs, consistency issues.
- Mitigations: Use asynchronous replication. Implement conflict resolution. Use multi-master replication.
- Example tools: AWS RDS standby (synchronous), AWS RDS Read Replicas (asynchronous), MongoDB Replica Set (asynchronous)

Cache
- Use cases/problems: High read load: Reduces latency for frequent reads. Session storage: Speeds up access to session data.
- Design questions: Design a high-performance caching layer. Optimize read-heavy workload.
- What it solves: Reduces latency, decreases load on databases.
- Caveats/issues: Cache consistency issues, potential for stale data.
- Mitigations: Implement cache invalidation strategies. Use Time-to-Live (TTL) settings. Employ write-through or write-back caching.
- Example tools: Redis, Memcached

In-Memory Database
- Use cases/problems: Real-time analytics: Requires fast data access. Leaderboards: High-speed data retrieval is crucial.
- Design questions: Design a real-time analytics system. Create a fast leaderboard service.
- What it solves: Extremely fast data retrieval, reduces latency.
- Caveats/issues: Volatile storage, high memory cost.
- Mitigations: Enable persistence options. Use hybrid storage models (in-memory + disk). Implement data backup strategies.
- Example tools: Redis, Memcached

Message Broker
- Use cases/problems: Event streaming: Manages high-throughput data streams. Real-time processing: Facilitates real-time data flows.
- Design questions: Design a real-time event streaming platform. Implement a reliable messaging system.
- What it solves: Facilitates message exchange, supports multiple patterns.
- Caveats/issues: Bottleneck potential, delivery guarantees.
- Mitigations: Use scalable brokers with partitions. Implement backpressure handling. Monitor message broker performance.
- Example tools: Apache Kafka, RabbitMQ, ActiveMQ

Distributed Queue
- Use cases/problems: Event-driven systems: Manages asynchronous events. Microservices: Decouples service communication.
- Design questions: Design an event-driven architecture. Create a reliable task processing system.
- What it solves: Manages asynchronous communication, decouples components.
- Caveats/issues: Message ordering and delivery guarantees.
- Mitigations: Use message brokers with strong ordering guarantees. Implement idempotent message processing. Use message deduplication techniques.
- Example tools: Apache Kafka, RabbitMQ, AWS SQS

Microservices
- Use cases/problems: Large applications: Enhances modularity and scalability. Continuous delivery: Facilitates independent deployment.
- Design questions: Design a scalable microservices architecture. Build a modular, independently deployable system.
- What it solves: Improves modularity, independent deployment.
- Caveats/issues: Increased communication complexity.
- Mitigations: Use service meshes. Implement standardized APIs. Use centralized logging and monitoring.
- Example tools: Docker, Kubernetes, Istio

Service Registry
- Use cases/problems: Microservices: Enables service discovery. Dynamic environments: Tracks changing service instances.
- Design questions: Design a service discovery mechanism. Implement dynamic service registration.
- What it solves: Tracks services and their instances.
- Caveats/issues: High availability required, consistency issues.
- Mitigations: Use distributed service registries. Implement regular health checks. Use consensus algorithms for consistency.
- Example tools: Consul, Eureka, Zookeeper

CDN (Content Delivery Network)
- Use cases/problems: Content-heavy sites: Improves load times for users. Global reach: Distributes content across regions.
- Design questions: Design a content delivery system. Optimize a global website's performance.
- What it solves: Reduces latency, improves load times.
- Caveats/issues: Cache invalidation complexity, cost.
- Mitigations: Implement cache purging strategies. Use regional CDNs. Monitor CDN performance and hit rates.
- Example tools: Cloudflare, Akamai, AWS CloudFront

Data Warehouse
- Use cases/problems: Business intelligence: Centralizes analytics data. Historical analysis: Supports complex querying over large datasets.
- Design questions: Design a data warehouse for analytics. Build a scalable business intelligence platform.
- What it solves: Centralizes data, supports complex queries.
- Caveats/issues: High storage and maintenance costs.
- Mitigations: Use data compression and partitioning. Implement data lifecycle management. Use cloud-based, scalable data warehouses.
- Example tools: Amazon Redshift, Snowflake, Google BigQuery

Search Engine
- Use cases/problems: E-commerce sites: Provides fast product search. Large datasets: Enables full-text search over extensive data.
- Design questions: Design a product search system. Implement a scalable search solution.
- What it solves: Enables fast search over large datasets.
- Caveats/issues: Indexing and maintenance required.
- Mitigations: Implement efficient indexing strategies. Use distributed search architectures. Optimize search queries and relevance.
- Example tools: Elasticsearch, Solr, Algolia

File Storage
- Use cases/problems: Media storage: Handles large files like images and videos. Backup solutions: Stores and retrieves backups.
- Design questions: Design a scalable file storage system. Implement a reliable backup solution.
- What it solves: Scales with data growth, handles unstructured data.
- Caveats/issues: Backup and redundancy required, retrieval latency.
- Mitigations: Use distributed file systems. Implement multi-region replication. Use lifecycle policies for data management.
- Example tools: AWS S3, Google Cloud Storage, HDFS

ETL Pipeline
- Use cases/problems: Data warehousing: Prepares data for analysis. Data migration: Transforms data from multiple sources.
- Design questions: Design an ETL pipeline for a data warehouse. Build a reliable data integration system.
- What it solves: Facilitates data integration and analysis.
- Caveats/issues: Complex to build and maintain.
- Mitigations: Use managed ETL services. Implement monitoring and error handling. Use data validation and transformation tools.
- Example tools: Apache NiFi, AWS Glue, Talend

Monitoring System
- Use cases/problems: System reliability: Monitors uptime and performance. Issue detection: Alerts for anomalies and failures.
- Design questions: Design a system monitoring solution. Implement an alerting and dashboard system.
- What it solves: Tracks system health, enables alerting.
- Caveats/issues: High overhead, potential noise.
- Mitigations: Use threshold tuning and anomaly detection. Implement efficient data collection. Use centralized monitoring dashboards.
- Example tools: Prometheus, Grafana, Datadog

Logging System
- Use cases/problems: Debugging: Captures logs for issue diagnosis. Compliance: Maintains audit trails.
- Design questions: Design a centralized logging system. Implement a scalable logging and analysis solution.
- What it solves: Aids in auditing and troubleshooting.
- Caveats/issues: Large data volumes, storage and querying.
- Mitigations: Use log rotation and retention policies. Implement centralized logging. Optimize log storage and indexing.
- Example tools: ELK Stack, Splunk, Fluentd

Authentication Service
- Use cases/problems: Secure applications: Manages user identity and access. Single sign-on: Centralizes authentication across services.
- Design questions: Design a secure authentication system. Implement a single sign-on solution.
- What it solves: Enhances security, manages user authentication.
- Caveats/issues: Single point of failure, security measures needed.
- Mitigations: Use multi-factor authentication. Implement redundancy and failover. Use secure token storage and management.
- Example tools: OAuth, Okta, Auth0

Orchestration Tool
- Use cases/problems: Containerized apps: Automates container management. Microservices: Coordinates service deployments.
- Design questions: Design a container orchestration system. Implement a CI/CD pipeline for microservices.
- What it solves: Automates deployment and management.
- Caveats/issues: Adds complexity, learning curve.
- Mitigations: Use managed orchestration services. Implement robust CI/CD pipelines. Use monitoring and scaling tools.
- Example tools: Kubernetes, Docker Swarm, Mesos

Configuration Service
- Use cases/problems: Dynamic applications: Centralizes config changes. Large systems: Manages configurations across services.
- Design questions: Design a configuration management system. Implement dynamic configuration updates.
- What it solves: Centralizes configuration management.
- Caveats/issues: Single point of failure, secure access needed.
- Mitigations: Use distributed configuration stores. Implement encryption for sensitive data. Use versioning and rollback mechanisms.
- Example tools: Consul, etcd, Spring Cloud Config

Real-Time Data Aggregation
- Use cases/problems: Real-time dashboards: Aggregates live data feeds. Monitoring: Provides instant insights from data streams.
- Design questions: Design a real-time analytics system. Implement a live data aggregation platform.
- What it solves: Enables real-time analytics and monitoring.
- Caveats/issues: High complexity, data velocity issues.
- Mitigations: Use stream processing frameworks. Implement windowing and aggregation techniques. Monitor and scale processing infrastructure.
- Example tools: Apache Flink, Apache Storm, AWS Kinesis

Distributed Tracing
- Use cases/problems: Microservices: Tracks requests across services. Performance tuning: Identifies bottlenecks and delays.
- Design questions: Design a distributed tracing system. Implement performance monitoring for microservices.
- What it solves: Aids in debugging and performance monitoring.
- Caveats/issues: High overhead, integration required.
- Mitigations: Use sampling to reduce overhead. Implement efficient trace storage. Use correlation IDs for request tracking.
- Example tools: Jaeger, Zipkin, OpenTracing

Circuit Breaker
- Use cases/problems: Fault tolerance: Prevents system overloads. Resilient services: Isolates failures in microservices.
- Design questions: Design a fault-tolerant microservices system. Implement circuit breakers for service reliability.
- What it solves: Protects services from cascading failures.
- Caveats/issues: Adds complexity, tuning needed.
- Mitigations: Use monitoring tools to detect failures. Implement fallback strategies. Use retries and exponential backoff.
- Example tools: Hystrix, Resilience4j, Istio

Rate Limiter
- Use cases/problems: API management: Protects against request floods. Fair resource allocation: Ensures fair usage policies.
- Design questions: Design an API rate limiting system. Implement a fair resource allocation mechanism.
- What it solves: Controls request rate, prevents abuse.
- Caveats/issues: Can impact user experience.
- Mitigations: Use dynamic rate limiting. Implement user-based quotas. Use monitoring to adjust limits.
- Example tools: Kong, Envoy, Nginx

Scheduler
- Use cases/problems: Periodic tasks: Automates recurring jobs. Batch processing: Manages large data processing tasks.
- Design questions: Design a job scheduling system. Implement a reliable task processing system.
- What it solves: Manages background jobs and tasks.
- Caveats/issues: Requires monitoring, can become a bottleneck.
- Mitigations: Use distributed schedulers. Implement job prioritization. Use monitoring and retry mechanisms.
- Example tools: Apache Airflow, Celery, Kubernetes CronJobs

Service Mesh
- Use cases/problems: Microservices: Handles inter-service communication. Observability: Provides insights into service interactions.
- Design questions: Design a service mesh for microservices. Implement observability for service interactions.
- What it solves: Manages microservices communication.
- Caveats/issues: Adds operational complexity.
- Mitigations: Use managed service meshes. Implement automation tools. Use monitoring and observability tools.
- Example tools: Istio, Linkerd, Consul Connect

Data Backup and Recovery
- Use cases/problems: Disaster recovery: Ensures data is safe and recoverable. Data integrity: Maintains backups for compliance.
- Design questions: Design a backup and recovery system. Implement a reliable disaster recovery solution.
- What it solves: Ensures data durability, protects against data loss.
- Caveats/issues: Resource-intensive, regular testing needed. Increases costs. If backups are not up to date, there will be data loss. Backups may not be accessible.
- Mitigations: Use automated backup solutions. Implement multi-region storage. Regularly test backup and recovery processes.
- Example tools: Use the native backup capabilities of the data store, or centralized backup products like AWS Backup, Google Cloud Backup, Veeam

Graph Database
- Use cases/problems: Social networks: Models complex relationships. Recommendation engines: Analyzes connected data.
- Design questions: Design a social network graph database. Implement a recommendation engine.
- What it solves: Efficiently handles graph-based data and relationships.
- Caveats/issues: Steep learning curve, non-graph query inefficiency.
- Mitigations: Use graph-specific optimizations. Implement hybrid models for different data types. Use indexing and caching for performance.
- Example tools: Neo4j, Amazon Neptune, OrientDB

Data Lake
- Use cases/problems: Big data analytics: Stores and processes vast data. Data warehousing: Prepares raw data for analytics.
- Design questions: Design a big data analytics platform. Implement a data lake for diverse data types.
- What it solves: Supports diverse data types and analytics.
- Caveats/issues: Governance required, risk of becoming a data swamp.
- Mitigations: Use metadata management. Implement data cataloging. Use data lifecycle policies.
- Example tools: AWS Lake Formation, Azure Data Lake, Hadoop

Data Streaming Platform
- Use cases/problems: Event-driven architectures: Processes data streams in real time. Analytics: Real-time insights from continuous data flow.
- Design questions: Design a real-time data streaming system. Implement an event-driven architecture.
- What it solves: Facilitates real-time data processing.
- Caveats/issues: High operational complexity.
- Mitigations: Use managed streaming services. Implement scaling strategies. Monitor and optimize processing.

System Design Cheat sheet
Picking the right architecture = Picking the right battles + Managing trade-offs

Basic Steps

1. Clarify and agree on the scope of the system


 Use cases (descriptions of sequences of events that, taken together, lead to a
system doing something useful)
o Who is going to use it?
o How are they going to use it?
 Constraints
o Mainly identify traffic and data handling constraints at scale.
o Scale of the system (such as requests per second, request types, data
written per second, data read per second)
o Special system requirements, such as multi-threading or a read- or
write-oriented workload.
2. High level architecture design (Abstract design)
 Sketch the important components and the connections between them, but don't
go into too much detail.
o Application service layer (serves the requests)
o List different services required.
o Data Storage layer
o e.g. a scalable system usually includes a web server tier (behind a load
balancer), services (partitioned by function), a database (a master/slave
database cluster), and caching systems.
3. Component Design
 Component + specific APIs required for each of them.
 Object oriented design for functionalities.

o Map features to modules: One scenario for one module.
o Consider the relationships among modules:
 Certain functions must have a unique instance (singletons).
 A core object can be made up of many other objects
(composition).
 One object is a specialized kind of another object (inheritance).
 Database schema design.
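
The three module relationships named in this step (singleton, composition, inheritance) can be shown in a few lines of Python. This is only an illustrative sketch: the URL-shortener scenario and the names `Settings`, `Storage`, `MemoryStorage`, and `UrlShortener` are hypothetical and do not come from the cheat sheet.

```python
class Settings:
    """Singleton: every call to Settings() returns the same shared instance."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.values = {"code_prefix": "u"}
        return cls._instance


class Storage:
    """Base type for anything that can save and load a record."""

    def save(self, key: str, value: str) -> None:
        raise NotImplementedError

    def load(self, key: str) -> str:
        raise NotImplementedError


class MemoryStorage(Storage):
    """Inheritance: a MemoryStorage *is a* Storage."""

    def __init__(self) -> None:
        self.data = {}

    def save(self, key: str, value: str) -> None:
        self.data[key] = value

    def load(self, key: str) -> str:
        return self.data[key]


class UrlShortener:
    """Composition: the shortener *has a* Storage and uses the shared Settings."""

    def __init__(self, storage: Storage) -> None:
        self.storage = storage
        self.settings = Settings()
        self.count = 0

    def shorten(self, url: str) -> str:
        self.count += 1
        code = f"{self.settings.values['code_prefix']}{self.count}"
        self.storage.save(code, url)
        return code

    def resolve(self, code: str) -> str:
        return self.storage.load(code)
```

Because `UrlShortener` depends only on the `Storage` interface, the in-memory store could later be swapped for a database-backed one without changing the shortener.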
4. Understanding Bottlenecks
 Perhaps your system needs a load balancer and many machines behind it to
handle the user requests.
 Or maybe the data is so huge that you need to distribute your database on
multiple machines. What are some of the downsides that occur from doing that?
 Is the database too slow and does it need some in-memory caching?
5. Scaling your abstract design
 Vertical scaling
o You scale by adding more power (CPU, RAM) to your existing machine.
 Horizontal scaling
o You scale by adding more machines into your pool of resources.
 Caching
o Load balancing helps you scale horizontally across an ever-increasing
number of servers, but caching will enable you to make vastly better
use of the resources you already have, as well as making otherwise
unattainable product requirements feasible.
o Application caching requires explicit integration in the application code
itself. Usually it will check if a value is in the cache; if not, retrieve the
value from the database.
o Database caching tends to be "free". When you flip your database on,
you're going to get some level of default configuration which will
provide some degree of caching and performance. Those initial settings
will be optimized for a generic use case, and by tweaking them to your
system's access patterns you can generally squeeze out a great deal of
performance improvement.
o In-memory caches are most potent in terms of raw performance. This is
because they store their entire set of data in memory and accesses to
RAM are orders of magnitude faster than those to disk. eg. Memcached
or Redis.
o Example: precalculating results (e.g. the number of visits from each
referring domain for the previous day).
o Example: pre-generating expensive indexes (e.g. suggested stories based on a
user's click history).
o Example: storing copies of frequently accessed data in a faster backend (e.g.
Memcached instead of PostgreSQL).
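
The application-caching bullet above (check the cache, and on a miss read from the database) can be sketched as follows. This is a minimal in-process illustration in Python, assuming a simple time-to-live policy; `TTLCache`, `get_or_load`, and the `load_from_db` callback are hypothetical names, and a real deployment would more likely use Memcached or Redis as the store.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple

_MISSING = object()  # sentinel so that a cached value of None still counts as a hit


class TTLCache:
    """Tiny in-process cache whose entries expire `ttl_seconds` after being set."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return _MISSING
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]      # expired: drop it and report a miss
            return _MISSING
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._store[key] = (self.clock() + self.ttl, value)


def get_or_load(cache: TTLCache, key: Hashable,
                load_from_db: Callable[[Hashable], Any]) -> Any:
    """Check the cache first; on a miss, read from the database and populate the cache."""
    value = cache.get(key)
    if value is _MISSING:
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

The TTL bounds how stale an entry can get; writes to the database would still need an invalidation step if fresher reads are required.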
 Load balancing
o Public servers of a scalable web service are hidden behind a load
balancer. This load balancer evenly distributes load (requests from your
users) onto your group/cluster of application servers.
o Types: Smart client (hard to get it perfect), Hardware load balancers
($$$ but reliable), Software load balancers (hybrid - works for most
systems)
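
As a rough illustration of what a software load balancer does, the sketch below distributes requests round-robin and skips servers that a health check has marked as down. It is a simplified Python model, not how Nginx or HAProxy are implemented; `RoundRobinBalancer` and its methods are hypothetical names.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Cycles through servers in order, skipping any that are marked unhealthy."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.servers: List[str] = list(servers)
        self.healthy: Set[str] = set(self.servers)
        self._next = 0

    def mark_down(self, server: str) -> None:
        """Record a failed health check."""
        self.healthy.discard(server)

    def mark_up(self, server: str) -> None:
        """Record a passing health check for a known server."""
        if server in self.servers:
            self.healthy.add(server)

    def pick(self) -> str:
        """Return the next healthy server, trying each server at most once."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Round-robin is only one policy; least-connections or weighted schemes follow the same shape with a different `pick`.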

 Database replication
o Database replication is the frequent electronic copying of data from a
database on one computer or server to a database on another, so that all
users share the same level of information. The result is a distributed
database in which users can access data relevant to their tasks without
interfering with the work of others. Keeping the copies in step removes
data ambiguity or inconsistency among users. This is a separate concern
from normalization, which refers to structuring a schema to reduce
redundancy.
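
A toy model can make the copying step concrete. The Python sketch below assumes asynchronous, log-based replication from one primary to one replica; the classes `Primary` and `Replica` are hypothetical and ignore failures, conflicts, and networking.

```python
from typing import Any, Dict, List, Tuple


class Primary:
    """Accepts writes and records each one in an ordered replication log."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.log: List[Tuple[str, Any]] = []

    def write(self, key: str, value: Any) -> None:
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own copy and catches up by replaying the primary's log."""

    def __init__(self, primary: Primary) -> None:
        self.primary = primary
        self.data: Dict[str, Any] = {}
        self.applied = 0  # number of log entries already applied

    def sync(self) -> int:
        """Apply log entries not seen yet and return how many were applied."""
        pending = self.primary.log[self.applied:]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)
        return len(pending)

    def read(self, key: str) -> Any:
        return self.data.get(key)
```

Until `sync()` runs, the replica returns stale data, which is the consistency trade-off of asynchronous replication.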
 Database partitioning
o Partitioning of relational data usually refers to decomposing your tables
either row-wise (horizontally) or column-wise (vertically).
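
The two directions of partitioning can be sketched in Python as follows. Hash-based assignment is only one way to split rows (range-based partitioning is another), and the function names `shard_for`, `partition_rows`, and `split_columns` are assumptions for illustration.

```python
import hashlib
from typing import Any, Dict, Iterable, List, Set, Tuple

Row = Dict[str, Any]


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard number using a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition_rows(rows: Iterable[Row], key_field: str, num_shards: int) -> List[List[Row]]:
    """Horizontal (row-wise) partitioning: each row lands in exactly one shard."""
    shards: List[List[Row]] = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(str(row[key_field]), num_shards)].append(row)
    return shards


def split_columns(row: Row, key_field: str, hot_fields: Set[str]) -> Tuple[Row, Row]:
    """Vertical (column-wise) partitioning: both halves keep the key so they can be rejoined."""
    hot = {k: v for k, v in row.items() if k == key_field or k in hot_fields}
    cold = {k: v for k, v in row.items() if k == key_field or k not in hot_fields}
    return hot, cold
```

Note that a plain modulo scheme moves most keys when `num_shards` changes; consistent hashing is a common way to reduce that movement.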
 Map-Reduce
o For sufficiently small systems you can often get away with ad hoc
queries on a SQL database, but that approach may not scale up trivially
once the quantity of data stored or the write load requires sharding your
database, and it will usually require dedicated slaves for the purpose of
performing these queries (at which point, maybe you'd rather use a
system designed for analyzing large quantities of data, rather than
fighting your database).
o Adding a map-reduce layer makes it possible to perform data and/or
processing intensive operations in a reasonable amount of time. You
might use it for calculating suggested users in a social graph, or for
generating analytics reports. eg. Hadoop, and maybe Hive or HBase.
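
The map-reduce idea can be shown in miniature with the analytics example used earlier in this sheet (visits per referring domain). The Python sketch below runs all three phases in one process; in Hadoop the map and reduce calls would be spread across many machines. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple
from urllib.parse import urlparse


def map_phase(referrer_url: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (domain, 1) for each visit that has a parseable referrer."""
    domain = urlparse(referrer_url).netloc
    if domain:
        yield domain, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(domain: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single result."""
    return domain, sum(counts)


def visits_per_domain(referrer_urls: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for url in referrer_urls for pair in map_phase(url))
    return dict(reduce_phase(domain, counts) for domain, counts in shuffle(pairs).items())
```

Because each map call looks at one record and each reduce call at one key, both phases can be parallelized independently.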
 Platform Layer (Services)
o Separating the platform and web application allows you to scale the
pieces independently. If you add a new API, you can add platform
servers without adding unnecessary capacity for your web application
tier.
o Adding a platform layer can be a way to reuse your infrastructure for
multiple products or interfaces (a web application, an API, an iPhone
app, etc) without writing too much redundant boilerplate code for
dealing with caches, databases, etc.

Key topics for designing a system

1. Concurrency
 Do you understand threads, deadlock, and starvation? Do you know how to
parallelize algorithms? Do you understand consistency and coherence?
2. Networking
 Do you roughly understand IPC and TCP/IP? Do you know the difference
between throughput and latency, and when each is the relevant factor?
3. Abstraction
 You should understand the systems you’re building upon. Do you know
roughly how an OS, file system, and database work? Do you know about the
various levels of caching in a modern OS?
4. Real-World Performance
 You should be familiar with the speed of everything your computer can do,
including the relative performance of RAM, disk, SSD and your network.
5. Estimation
 Estimation, especially in the form of a back-of-the-envelope calculation, is
important because it helps you narrow down the list of possible solutions to
only the ones that are feasible. Then you have only a few prototypes or micro-
benchmarks to write.

6. Availability & Reliability
 Are you thinking about how things can fail, especially in a distributed
environment? Do you know how to design a system to cope with network failures?
Do you understand durability?
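
The estimation topic above is easiest to practice with a small calculator. The Python sketch below turns a few traffic assumptions into requests per second and daily storage; the function name, its parameters, and the default peak factor of 2 are hypothetical choices for illustration, and the inputs in the usage note are made-up numbers.

```python
from typing import Dict

SECONDS_PER_DAY = 86_400


def estimate_load(daily_active_users: int,
                  requests_per_user_per_day: int,
                  write_fraction: float,
                  bytes_per_write: int,
                  peak_factor: float = 2.0) -> Dict[str, float]:
    """Back-of-the-envelope numbers: average and peak request rate, plus new data per day."""
    total_requests = daily_active_users * requests_per_user_per_day
    avg_rps = total_requests / SECONDS_PER_DAY
    writes_per_day = total_requests * write_fraction
    return {
        "avg_rps": avg_rps,
        "peak_rps": avg_rps * peak_factor,   # assumed multiple of the average
        "storage_bytes_per_day": writes_per_day * bytes_per_write,
    }
```

For example, `estimate_load(8_640_000, 10, 0.25, 1_000)` describes 86.4 million requests a day, which averages 1,000 requests per second and adds 21.6 GB of new data per day.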

Web App System design considerations:

 Security (CORS)
 Using CDN
o A content delivery network (CDN) is a network of distributed servers
that delivers webpages and other web content to a user based on the
user's geographic location, the origin of the webpage, and the location of
the content delivery server.
o This service is effective in speeding the delivery of content of websites with
high traffic and websites that have global reach. The closer the CDN server is
to the user geographically, the faster the content will be delivered to the
user.
o CDNs also provide protection from large surges in traffic.
 Full Text Search
o Use Sphinx, Lucene, or Solr, which achieve fast search responses because
they search an index instead of searching the text directly.
 Offline support/Progressive enhancement
o Service Workers
 Web Workers
 Server Side rendering
 Asynchronous loading of assets (Lazy load items)
 Minimizing network requests (HTTP/2 + bundling/sprites etc)
 Developer productivity/Tooling
 Accessibility

 Internationalization
 Responsive design
 Browser compatibility
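
The full-text search item in the list above says that search engines query an index rather than the text itself. A minimal inverted index shows why that is fast: each term maps directly to the documents containing it. This Python sketch is a simplification with hypothetical function names; Lucene and Solr add ranking, stemming, and on-disk structures that are omitted here.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Inverted index: term -> set of IDs of the documents that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return the IDs of documents containing every term in the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A query now costs a few dictionary lookups and set intersections instead of a scan over every document.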

Working Components of Front-end Architecture

 Code
o HTML5/WAI-ARIA
o CSS/Sass Code standards and organization
o Object-Oriented approach (how do objects break down and get put together)
o JS frameworks/organization/performance optimization techniques
o Asset Delivery - Front-end Ops
 Documentation
o Onboarding Docs
o Styleguide/Pattern Library
o Architecture Diagrams (code flow, tool chain)
 Testing
o Performance Testing
o Visual Regression
o Unit Testing
o End-to-End Testing
 Process
o Git Workflow
o Dependency Management (npm, Bundler, Bower)
o Build Systems (Grunt/Gulp)
o Deploy Process
o Continuous Integration (Travis CI, Jenkins)
