System Design

The document discusses system design frameworks, principles, patterns and technologies. It covers the 4 steps for effective system design interviews: understanding the problem, proposing a high-level design, doing a deep dive design, and wrapping up. Key acronyms and concepts covered include CAP theorem, SOLID principles, design patterns like builder and proxy, load balancing algorithms, caching, message queues, and scaling databases. Technologies discussed are Spring, Java, and SQL.

Uploaded by klebermagno
Copyright: © All Rights Reserved

Contents

System Design
Framework
4 steps for an effective system design interview
Step 1 - Understand the problem and establish design scope
Step 2 - Propose high-level design and get buy-in
System Design Acronyms
CAP theorem
PACELC Theorem
BASE
Basically available
Soft state
Eventual consistency
SOLID
ACID
KISS
Design pattern
Creational
Builder
Prototype
Abstract Factory
Singleton
Factory Method
Structural
Adapter
Bridge
Composite
Decorator
Facade
Flyweight
Proxy
Behavioral
Chain of responsibility
Command
Interpreter
Iterator
Mediator
Memento
Observer
State
Strategy
Visitor
Template Method
Load Balancing
https://blog.bytebytego.com/p/ep47-common-load-balancing-algorithms
Cache
Cache tier
Content delivery network (CDN)
Stateless web tier
Stateful web tier
Message queue
Event Driven
Inventory Management and Concurrency Control
Logging, metrics, automation
Scale System
Scale Database
Dependency Injection
Technologies (Theory)
Spring
Inversion of Control (IoC) and Dependency Injection (DI)
Injection types
Java
Database System
Types of SQL Statements
DML
DDL
System Design

Framework

4 steps for an effective system design interview

Step 1 - Understand the problem and establish design scope


Clarification questions:
● What specific features are we going to build?
● How many users does the product have?
● How fast does the company anticipate scaling up?
● What are the anticipated scales in 3 months, 6 months, and a year?
● What is the company’s technology stack?
● What existing services might you leverage to simplify the design?

Step 2 - Propose high-level design and get buy-in

Step 3 - Design deep dive

Step 4 - Wrap up
System Design Acronyms

CAP theorem
Trade-off between consistency, availability, and partition tolerance in distributed systems.

● Consistency: consistency means all clients see the same data at the same time no
matter which node they connect to.
● Availability: availability means any client which requests data gets a response even if
some of the nodes are down.
● Partition Tolerance: a partition indicates a communication break between two nodes.
Partition tolerance means the system continues to operate despite network partitions.

“2 of 3”: CAP theorem states that a distributed system can't provide more than two of these
three guarantees simultaneously.

Example: a bank has 2 ATMs and a network partition cuts the link between them. If the system
favors availability, you can still withdraw at both ATMs, but they can no longer synchronize, so
each may end up with an inconsistent balance. If the system favors consistency, the disconnected
ATM refuses withdrawals until the link is restored, so it is not available.
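
The ATM example can be sketched as a toy model. This is only an illustration under simple assumptions: the Atm class, its Mode enum, and the starting balance of 100 are hypothetical names and values, and real systems reconcile state in far more elaborate ways.

```java
/** Toy model of one ATM that may lose contact with the rest of the bank. */
class Atm {
    enum Mode { AVAILABILITY_FIRST, CONSISTENCY_FIRST }

    private final Mode mode;
    private int localBalance;     // this ATM's copy of the account balance
    private boolean partitioned;  // true while the link to the other node is down

    Atm(Mode mode, int balance) {
        this.mode = mode;
        this.localBalance = balance;
    }

    void setPartitioned(boolean partitioned) { this.partitioned = partitioned; }

    int localBalance() { return localBalance; }

    /** Returns true if the withdrawal is served. */
    boolean withdraw(int amount) {
        if (partitioned && mode == Mode.CONSISTENCY_FIRST) {
            return false; // refuse rather than risk diverging from the other ATM
        }
        if (amount > localBalance) return false;
        localBalance -= amount; // availability first: serve now, reconcile later
        return true;
    }

    public static void main(String[] args) {
        // Two availability-first ATMs share an account of 100, then lose contact.
        Atm left = new Atm(Mode.AVAILABILITY_FIRST, 100);
        Atm right = new Atm(Mode.AVAILABILITY_FIRST, 100);
        left.setPartitioned(true);
        right.setPartitioned(true);
        System.out.println(left.withdraw(80) + " " + right.withdraw(80)); // both served
        // 160 was paid out of an account holding 100: the two copies are inconsistent.

        Atm strict = new Atm(Mode.CONSISTENCY_FIRST, 100);
        strict.setPartitioned(true);
        System.out.println(strict.withdraw(80)); // refused while partitioned
    }
}
```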
PACELC Theorem

An extension of CAP. It states that in case of a network partition (P) in a distributed computer
system, one has to choose between availability (A) and consistency (C); else (E), even when
the system is running normally in the absence of partitions, one has to choose between latency
(L) and consistency (C).
In short: even without a partition, you trade latency against consistency.

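BASE

Basically available
The system keeps responding to requests, possibly with stale or partial data, even when some
nodes fail.

Soft state
Stored state may change over time without new input, because replicas are still converging.

Eventual consistency
If no new updates arrive, all replicas eventually converge to the same value.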

SOLID
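● Single responsibility: a class should have only one reason to change.
● Open/closed: software entities should be open for extension but closed for modification.
● Liskov substitution: a subtype must be usable wherever its base type is expected.
● Interface segregation: clients should not be forced to depend on methods they do not use.
● Dependency inversion: depend on abstractions, not on concrete implementations.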

ACID
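● Atomicity: a transaction is applied entirely or not at all.
● Consistency: a transaction moves the database from one valid state to another.
● Isolation: concurrent transactions do not see each other's intermediate states.
● Durability: once committed, a transaction survives system failures.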
KISS
"Keep it simple, stupid": prefer the simplest design that meets the requirements.
Design pattern
Creational

Builder

Prototype

Abstract Factory

Singleton

Factory Method

Structural

Adapter

Bridge

Composite

Decorator

Facade

Flyweight

Proxy

Behavioral

Chain of responsibility

Command

Interpreter

Iterator

Mediator

Memento

Observer

State

Strategy

Visitor

Template Method
Load Balancing
A load balancer distributes incoming traffic among the web servers that are defined in a
load-balanced set. Clients can no longer reach the servers directly; they connect to the load
balancer instead.

Common GCP load balancing solutions:

● Google Cloud Load Balancing: GCP offers a variety of load balancing options that
work at different layers of the OSI model:
○ Global HTTP(S) Load Balancer (Layer 7): Suitable for HTTP/HTTPS traffic, offering
features like URL maps and SSL termination.
○ TCP/SSL Proxy Load Balancer (Layer 4): Useful for TCP traffic where you need to use
an IP address per region or SSL termination.
○ Internal Load Balancer: For load balancing internal traffic within GCP.
● Google Kubernetes Engine (GKE): If your Java application is containerized and
deployed using Kubernetes on GKE, you can use the built-in load balancing features of
Kubernetes, which integrate with GCP's load balancers.
● Google Cloud Armor: While not a load balancer itself, Google Cloud Armor works with
the Google Cloud Load Balancing service to provide additional security features such as
DDoS protection and web application firewall (WAF) capabilities.
● Zuul: This is an edge service that provides dynamic routing, monitoring, resiliency, and
security. It can be used in conjunction with GCP's load balancers to manage traffic to
your Java applications.
● HAProxy or NGINX: These are popular open-source software load balancers that can
be deployed on Compute Engine instances to balance traffic before it reaches your
application servers.
● Traffic Director: For service mesh scenarios, Traffic Director is GCP's fully managed
service mesh control plane that provides traffic control and load balancing for
microservices.
● Istio: As a service mesh, Istio provides advanced traffic management capabilities like
load balancing, retries, and circuit breaking. It can be used on GKE for fine-grained
control over traffic.

When choosing tools for load balancing, you should consider the specific requirements of your
application, such as whether you need global or regional load balancing, the type of traffic you
need to manage (HTTP, HTTPS, TCP, UDP), and whether you need additional features like SSL
termination, health checks, or integration with a content delivery network (CDN).
https://blog.bytebytego.com/p/ep47-common-load-balancing-algorithms
Static Algorithms
● Round robin
○ The client requests are sent to different service instances in sequential order. The
services are usually required to be stateless.
● Sticky round-robin
○ This is an improvement of the round-robin algorithm. If Alice’s first request goes
to service A, the following requests go to service A as well.
● Weighted round-robin
○ The admin can specify the weight for each service. The ones with a higher weight
handle more requests than others.
● Hash
○ This algorithm applies a hash function on the incoming requests’ IP or URL. The
requests are routed to relevant instances based on the hash function result.
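
Two of the static algorithms above, round robin and hash, can be sketched as follows. This is a minimal illustration, not a production balancer: the StaticBalancer class, its method names, and the instance names are hypothetical, and String.hashCode stands in for whatever hash function a real balancer would use.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Static load-balancing strategies over a fixed list of instance names. */
class StaticBalancer {
    private final List<String> instances;
    private final AtomicInteger counter = new AtomicInteger();

    StaticBalancer(List<String> instances) {
        if (instances.isEmpty()) throw new IllegalArgumentException("no instances");
        this.instances = List.copyOf(instances);
    }

    /** Round robin: hand out instances in sequential order, wrapping around. */
    String nextRoundRobin() {
        int i = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(i);
    }

    /** Hash: the same key (client IP or URL) always maps to the same instance. */
    String pickByHash(String key) {
        int i = Math.floorMod(key.hashCode(), instances.size());
        return instances.get(i);
    }

    public static void main(String[] args) {
        StaticBalancer lb =
                new StaticBalancer(List.of("service-a", "service-b", "service-c"));
        for (int n = 0; n < 4; n++) {
            System.out.println("round robin -> " + lb.nextRoundRobin());
        }
        System.out.println("hash -> " + lb.pickByHash("203.0.113.7"));
    }
}
```

Math.floorMod keeps the index non-negative even when the counter overflows or the hash code is negative.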
Dynamic Algorithms
● Least connections
○ A new request is sent to the service instance with the least concurrent
connections.
● Least response time
○ A new request is sent to the service instance with the fastest response time.
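
The least-connections rule can be sketched like this. It is only a sketch under stated assumptions: LeastConnectionsBalancer and its acquire/release methods are hypothetical names, and the balancer is assumed to be told when each request finishes. Least response time would follow the same shape with a measured latency in place of the connection count.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Routes each new request to the instance with the fewest open connections. */
class LeastConnectionsBalancer {
    private final Map<String, Integer> active = new LinkedHashMap<>();

    LeastConnectionsBalancer(List<String> instances) {
        if (instances.isEmpty()) throw new IllegalArgumentException("no instances");
        for (String name : instances) active.put(name, 0);
    }

    /** Picks the least-loaded instance and counts the new connection against it. */
    synchronized String acquire() {
        String best = null;
        for (Map.Entry<String, Integer> e : active.entrySet()) {
            if (best == null || e.getValue() < active.get(best)) best = e.getKey();
        }
        active.merge(best, 1, Integer::sum);
        return best;
    }

    /** Call when a request finishes so the instance's count goes back down. */
    synchronized void release(String instance) {
        active.computeIfPresent(instance, (k, v) -> Math.max(0, v - 1));
    }

    public static void main(String[] args) {
        LeastConnectionsBalancer lb = new LeastConnectionsBalancer(List.of("a", "b"));
        String first = lb.acquire();   // both idle, the tie goes to the first instance
        String second = lb.acquire();  // the other instance now has fewer connections
        lb.release(first);
        String third = lb.acquire();   // the released instance is least loaded again
        System.out.println(first + " " + second + " " + third); // prints "a b a"
    }
}
```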
Cache
A cache is a temporary storage area that stores the results of expensive responses or frequently
accessed data in memory so that subsequent requests are served more quickly.

Cache tier
The cache tier is a temporary data store layer, much faster than the database.
Benefits:
● Better system performance,
● Ability to reduce database workloads,
● Ability to scale the cache tier independently
Examples (the call shape resembles a Memcached-style client, where the third argument to set
is typically the expiry time in seconds):
cache.set('mykey', 'content', 3600)
cache.get('mykey')
Disadvantages:
● Inconsistency: data-modification operations on the data store and the cache are not in a single
transaction.
● A single cache server is a potential single point of failure (SPOF).
● Eviction policy: once the cache is full, any request to add items might cause existing
items to be removed. Common policies:
○ Least recently used (LRU)
○ Least frequently used (LFU)
○ First in, first out (FIFO)
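
As an illustration of an eviction policy, here is a minimal LRU cache sketch built on java.util.LinkedHashMap in access order. The LruCache class name and the capacity of 2 are assumptions made for the example; it is an in-process map, not a cache server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Fixed-capacity cache that evicts the least recently used entry when full. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() marks an entry as recent
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // checked after each put; drops the eldest entry
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");       // "a" becomes the most recently used key
        cache.put("c", "3");  // the cache is over capacity, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```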

Content delivery network (CDN)

A CDN is a network of geographically dispersed servers used to deliver static content such as
images, videos, CSS, and JavaScript files.
Stateless web tier
A stateless web tier keeps no client information or state between requests.

Stateful web tier


The server remembers client data/state from one request to the next.
Message queue
A message queue is a durable component, stored in memory, that supports asynchronous
communication. It serves as a buffer and distributes asynchronous requests. The basic
architecture of a message queue is simple.
Services involved:
● Producers/publishers: input services that create messages and publish them to a message queue.
● Consumers/subscribers: services that connect to the queue and perform the actions defined by
the messages.
Decoupling makes the message queue a preferred architecture for building a scalable and
reliable application.
● The producer can post a message when the consumer is unavailable, and
● the consumer can read the message when the producer is unavailable.
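
The decoupling described above can be sketched with an in-process queue. This is a stand-in under simple assumptions: a java.util.concurrent.BlockingQueue plays the role of the message queue, and QueueDemo with its produce and consume methods are hypothetical names. A real broker would add persistence and network transport.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Producer and consumer share only the queue; they never call each other. */
class QueueDemo {
    /** Producer side: publish messages even though no consumer is running yet. */
    static void produce(BlockingQueue<String> queue, int count) {
        for (int i = 1; i <= count; i++) queue.offer("order-" + i);
    }

    /** Consumer side: drain what is waiting, even if the producer already stopped. */
    static List<String> consume(BlockingQueue<String> queue) {
        List<String> handled = new ArrayList<>();
        String message;
        while ((message = queue.poll()) != null) handled.add("processed " + message);
        return handled;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        produce(queue, 3);                  // the consumer is "unavailable" here
        System.out.println(queue.size() + " messages buffered");
        System.out.println(consume(queue)); // the producer is done; messages still arrive
    }
}
```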

Advantages:
● Asynchronous processing: allows different parts of the system to communicate and
process tasks at their own pace.
● Decoupling: services can operate independently.
● Scalability: easier to scale different parts of the system as demand changes.
● Reliability: provides mechanisms to ensure messages are not lost and can be processed
even in case of temporary failure.
● Load balancing: distributes tasks efficiently across workers.
● Ordering and consistency: some queues can preserve message order (for example, FIFO
delivery), so related messages are processed in sequence.

Disadvantages:
● Complexity: adds complexity to operating and maintaining the system.
● Latency: can introduce latency in message processing.
● Monitoring and management: requires monitoring to ensure smooth operation.
● Dependency and integration: requires integration with existing systems, which can be
challenging.
● Cost: depending on the solution, it can increase operational costs.

Comparison of Popular Message Queues

Kafka
Pros:
● High throughput and scalability.
● Strong durability guarantees.
● Suitable for event streaming and log aggregation.
Cons:
● Complex setup and management.
● Steeper learning curve.
RabbitMQ
Pros:
● Mature, with strong support for various messaging protocols.
● Easier to set up and manage than Kafka.
● Good for lightweight and traditional message queue needs.
Cons:
● Lower throughput compared to Kafka.
● Not ideal for log aggregation or event streaming at a very large scale.

GCP Pub/Sub

Pros:
● Fully managed service, reducing operational overhead.
● Easy integration with other Google Cloud services.
● Good scalability and reliability.
Cons:
● Limited to the Google Cloud Platform ecosystem.
● Potentially higher costs for high-volume usage.
● Less control compared to self-hosted solutions.
Other Options

Amazon SQS: A fully managed queue service in AWS, easy to use but with limited message
size.
Azure Service Bus: Microsoft's solution, integrates well with other Azure services, offering
features like message sessions and dead-letter queues.
ActiveMQ: A reliable and mature open-source message broker with support for multiple
protocols.

Push and Pull Subscribers


Push Subscribers:
How it Works: The queue system automatically sends messages to the subscribers as soon as
they become available in the queue.
Advantages:
● Real-time: Subscribers receive messages immediately, which is useful for time-sensitive
data.
● Less Overhead: Subscribers don't have to continuously check for new messages.
Disadvantages:
● Risk of Overloading: If the subscriber can't process messages quickly enough, it can
become overwhelmed.
● Dependency on Subscriber Availability: Requires that the subscriber is always ready to
receive and process messages.
Pull Subscribers:
How it Works: Subscribers periodically check the queue for new messages and retrieve
them.
Advantages:
● Control: Subscribers have more control over when they receive messages, which can be
beneficial for handling workload spikes.
● Flexibility: Easier to implement in systems where subscriber readiness cannot be
guaranteed.
Disadvantages:
● Latency: There can be a delay between the message arrival in the queue and its
processing.
● Increased Overhead: Constant polling for new messages can lead to inefficiency,
especially in low-volume scenarios.

Types of Publishers

Synchronous vs. Asynchronous Publishers:


● Synchronous: The publisher waits for an acknowledgment from the message queue
system before continuing. This ensures that the message has been successfully queued
but can slow down the publisher.
● Asynchronous: The publisher sends a message and continues without waiting for an
acknowledgment. This is faster but risks message loss if the queue system fails to
receive the message.
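
The difference between the two publisher styles can be sketched with CompletableFuture. The broker here is an in-memory queue and the send method only simulates a network call; PublisherDemo and all of its method names are hypothetical, not the API of any real messaging client.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Contrasts waiting for the broker's acknowledgment with not waiting for it. */
class PublisherDemo {
    private final Queue<String> broker = new ConcurrentLinkedQueue<>(); // stand-in broker

    /** Simulated network call: the acknowledgment arrives on another thread. */
    private CompletableFuture<Boolean> send(String message) {
        return CompletableFuture.supplyAsync(() -> broker.offer(message));
    }

    /** Synchronous publish: block until the broker confirms the message is queued. */
    boolean publishSync(String message) {
        return send(message).join();
    }

    /** Asynchronous publish: return at once; the caller may check the future later. */
    CompletableFuture<Boolean> publishAsync(String message) {
        return send(message);
    }

    int queued() { return broker.size(); }

    public static void main(String[] args) {
        PublisherDemo publisher = new PublisherDemo();
        boolean acked = publisher.publishSync("m1");                       // waits
        CompletableFuture<Boolean> pending = publisher.publishAsync("m2"); // does not wait
        System.out.println("sync acked: " + acked);
        System.out.println("async acked eventually: " + pending.join());
        System.out.println("queued: " + publisher.queued());
    }
}
```

An asynchronous publisher that never inspects the returned future has no way to notice a failed send, which is the message-loss risk mentioned above.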
Batch vs. Single-Message Publishers:
● Batch: The publisher sends a batch of messages at once. This can be more efficient for
the message system but requires more memory and processing power on the
publisher's side.
● Single-Message: Each message is sent individually, simpler to implement but potentially
less efficient.
Reliability and Durability Considerations:
Some publishers might implement mechanisms to ensure message delivery even in the event of
failures (e.g., retry logic, transactional publishing).
Event-Driven Publishers:
In some architectures, particularly with event-driven designs, publishers might trigger messages
based on events or state changes, which could be automated or result from user actions.

Event Driven
An event-driven system is a design paradigm where the flow of the program is determined by
events such as user actions, sensor outputs, or messages from other programs. This approach
contrasts with a more traditional, procedural programming model, and it is particularly
well-suited to environments where certain conditions or changes in state dictate the execution of
various functionalities.
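
A minimal sketch of this idea is an in-process event bus, where code runs only in reaction to published events. The EventBus class, the event type strings, and the handlers are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Handlers run only when an event of the type they subscribed to is published. */
class EventBus {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();

    void subscribe(String eventType, Consumer<String> handler) {
        handlers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
    }

    /** Delivers the payload to every handler for the type; returns how many ran. */
    int publish(String eventType, String payload) {
        List<Consumer<String>> list = handlers.getOrDefault(eventType, List.of());
        list.forEach(h -> h.accept(payload));
        return list.size();
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        bus.subscribe("order.created", p -> System.out.println("reserve stock for " + p));
        bus.subscribe("order.created", p -> System.out.println("send email for " + p));
        bus.publish("order.created", "order-42");
        bus.publish("order.cancelled", "order-7"); // no handler registered, nothing runs
    }
}
```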
Inventory Management and Concurrency Control
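● Optimistic Locking: When a customer attempts to purchase an item, the system reads the
item without locking it and checks if it is available. If yes, the purchase proceeds, and the
inventory is updated only after verifying at write time that no one else changed the record
in the meantime (for example, by comparing a version number). This approach assumes
conflicts are rare but still checks to ensure data integrity. It’s best for scenarios where
write conflicts are not frequent.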
● Pessimistic Locking: Lock the item for a short period when a user adds it to their cart.
This approach is more straightforward but can lead to scalability issues as it locks
resources and might impact user experience.
● Database Transactions: Use database-level transactions to ensure that the inventory
update operations are atomic. This helps in maintaining consistency even in the event of
system failures.
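
Optimistic locking can be sketched in memory with a versioned snapshot and a compare-and-set. This is an assumption-laden illustration: the Inventory and Stock classes are hypothetical, and a database would typically perform the same check with a version column in the UPDATE condition rather than an AtomicReference.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Read a versioned snapshot, then write only if the snapshot is still current. */
class Inventory {
    /** Immutable snapshot of the stock level plus a version number. */
    static final class Stock {
        final int quantity;
        final long version;
        Stock(int quantity, long version) {
            this.quantity = quantity;
            this.version = version;
        }
    }

    private final AtomicReference<Stock> stock;

    Inventory(int quantity) {
        this.stock = new AtomicReference<>(new Stock(quantity, 0));
    }

    Stock read() { return stock.get(); }

    /** Succeeds only if nobody replaced the snapshot since 'seen' was read. */
    boolean tryPurchase(Stock seen, int amount) {
        if (seen.quantity < amount) return false;      // not enough stock
        Stock updated = new Stock(seen.quantity - amount, seen.version + 1);
        return stock.compareAndSet(seen, updated);     // false on a write conflict
    }

    /** Typical caller loop: re-read and retry when a concurrent update wins. */
    boolean purchase(int amount) {
        while (true) {
            Stock seen = read();
            if (seen.quantity < amount) return false;
            if (tryPurchase(seen, amount)) return true;
        }
    }

    public static void main(String[] args) {
        Inventory inventory = new Inventory(1);
        Stock alice = inventory.read();
        Stock bob = inventory.read();                           // same snapshot as Alice
        System.out.println(inventory.tryPurchase(alice, 1));    // true: first writer wins
        System.out.println(inventory.tryPurchase(bob, 1));      // false: snapshot is stale
        System.out.println(inventory.purchase(1));              // false: out of stock
    }
}
```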

Cart Abandonment and Timeouts


Cart Expiration Policy: Implement a timeout for items in the cart. If the user does not proceed
to checkout within this time, the item is released back into the inventory. This duration can be
determined based on user behavior analysis.
Reminder Notifications: Send reminders to users about items in their carts as the expiration
time approaches. This encourages them to complete the purchase or release the item.
Scalability and Performance
Queueing Systems: For high-traffic websites, use a queuing system like RabbitMQ or Kafka to
handle purchase requests. This ensures that requests are processed in order and can help
manage load on the system.
Caching: Implement caching strategies for inventory data to reduce database load. However,
ensure that cache invalidation occurs promptly when inventory changes.
Handling Edge Cases
Backorder or Waitlist: If an item is popular and runs out of stock, consider offering backorders
or a waitlist. This helps capture customer interest without promising immediate availability.
Real-time Inventory Updates: Use web sockets or similar technologies to update inventory
information in real-time on the user interface. This minimizes the chances of a user trying to
purchase an item that has just gone out of stock.
https://blog.bytebytego.com/p/why-do-we-need-a-message-queue
Logging, metrics, automation
Datadog
Logging: Monitoring error logs is important because it helps to identify errors and problems in
the system. You can monitor error logs at the per-server level or use tools to aggregate them
into a centralized service for easy search and viewing.
Metrics: Collecting different types of metrics helps us to gain business insights and understand
the health status of the system. Some of the following metrics are useful:
● Host-level metrics: CPU, memory, disk I/O, etc.
● Aggregated-level metrics: for example, the performance of the entire database tier, cache
tier, etc.
● Key business metrics: daily active users, retention, revenue, etc.
Automation: When a system gets big and complex, we need to build or leverage automation
tools to improve productivity. Continuous integration is a good practice, in which each code
check-in is verified through automation, allowing teams to detect problems early. Besides,
automating your build, test, and deploy processes could improve developer productivity
significantly.
Scale System
● Keep web tier stateless
● Build redundancy at every tier
● Cache data as much as you can
● Support multiple data centers (zones)
● Host static assets in CDN
● Scale your data tier by sharding
● Split tiers into individual services
● Monitor your system and use automation tools.

Scale Database
● Partition: We can split a table to gain performance. Example plit
● Denormalization: Joins can take too much time and consume many resources. We can
denormalize our database to gain performance.
● Index
● Vertical Scale
● Horizontal scale

Dependency Injection
Big Data
Clean Code
Dependency injection
Java Dependency Injection

Containerization & Containers Orchestration


Design patterns
Microservice Architecture Pattern
Software Quality
Domain-Driven Design (DDD)
Java Microservice Infrastructure
APIs and Integration
CI/CD
Java Bootstrapping
Java Caching
Java Cloud
Java Messaging
Java NoSQL
Java Observability
Java Persistence
Java SQL
Java Security
Technologies (Theory)
Spring
https://docs.spring.io/spring-framework/reference/overview.html

Spring Core is a fundamental part of the Spring Framework, a comprehensive framework for
building Java applications, especially for enterprise-level development. It provides a wide
range of features that simplify Java development and promote good software design
practices. Here's an overview of its key concepts and internal organization:

Inversion of Control (IoC) and Dependency Injection (DI):


● Inversion of Control (IoC): This principle is about shifting control of the object creation,
configuration, and lifecycle to the framework rather than the application code. It helps in
decoupling the code dependencies.
● Dependency Injection (DI): A form of IoC where dependencies are injected into objects
by the framework. There are various types of DI - constructor injection, setter injection,
and field injection.
Bean Management:
● Beans: The objects that form the backbone of your application and are managed by the
Spring IoC container are called beans. A bean is an object that is instantiated,
assembled, and otherwise managed by a Spring IoC container.
● Bean Lifecycle: Beans go through various stages such as instantiation, populating
properties, and initialization. Spring provides ways to hook into these lifecycle events.
ApplicationContext and BeanFactory:
● ApplicationContext: It's a more advanced form of the BeanFactory. It provides
additional features such as easier integration with Spring’s AOP features, message
resource handling, and event propagation.
● BeanFactory: It's the simplest container, providing basic support for DI and defined by
the org.springframework.beans.factory.BeanFactory interface.
AOP (Aspect-Oriented Programming):
Allows defining cross-cutting concerns (like logging, transaction management) that are separate
from the business logic.
The Spring AOP module provides AOP support through a proxying mechanism.
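
The proxying idea can be illustrated with a plain JDK dynamic proxy (java.lang.reflect.Proxy). This sketch is not Spring's own code and uses no Spring APIs: LoggingProxyDemo, PaymentService, and withLogging are hypothetical names, and logging is simply recorded in a list.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Logging as a cross-cutting concern, added around a service by a proxy. */
class LoggingProxyDemo {
    interface PaymentService {
        String pay(String account, int amount);
    }

    /** Business logic only; it knows nothing about logging. */
    static class SimplePaymentService implements PaymentService {
        public String pay(String account, int amount) {
            return "paid " + amount + " from " + account;
        }
    }

    /** Wraps the target so every call is recorded before and after it runs. */
    static PaymentService withLogging(PaymentService target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("before " + method.getName());
            Object result = method.invoke(target, args); // delegate to the real object
            log.add("after " + method.getName());
            return result;
        };
        return (PaymentService) Proxy.newProxyInstance(
                PaymentService.class.getClassLoader(),
                new Class<?>[] {PaymentService.class},
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        PaymentService service = withLogging(new SimplePaymentService(), log);
        System.out.println(service.pay("acc-1", 10));
        System.out.println(log); // prints [before pay, after pay]
    }
}
```

The caller only sees the PaymentService interface, so the logging behavior can be added or removed without touching the business class.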
Spring Configuration:
● XML Configuration: Initially, Spring relied heavily on XML files for bean configuration.
● Annotation-Based Configuration: This approach allows for more concise and
expressive configuration embedded directly in the Java code. Common annotations
include @Component, @Service, @Repository, @Autowired, etc.
● Java-Based Configuration: Enables you to write configuration as Java code using
@Configuration and @Bean annotations.
Event Handling:
Spring's event handling is an application of the Observer pattern, where beans can publish and
listen to application events.
Resource Abstraction:
Spring provides a Resource abstraction (org.springframework.core.io.Resource) for low-level
access to resources, such as files and classpath resources.
Internationalization (i18n):
Supports internationalization to provide locale-specific text messages.
Data Access / Integration:
Integrates seamlessly with JDBC, Hibernate, JPA, etc., to provide a consistent data access
experience.
Transaction Management:
Offers a consistent programming model for both programmatic and declarative transaction
management.

Injection types
Here's a brief overview of each injection type:
Constructor Injection:
● Dependencies are provided through a class constructor.
● This is the approach recommended by Spring, as it lets you implement immutable objects:
the dependency cannot be altered after the object has been created.
● It's especially suitable when the dependency is mandatory and the object cannot
function without it.
● You can use the @Autowired annotation on the constructor to auto-wire the
dependencies, although as of Spring 4.3, if the class has only one constructor, it's not
necessary to annotate it with @Autowired.

Setter Injection:
● Dependencies are provided through setter methods in the class.
● This is a more traditional way of dependency injection and can be used for optional
dependencies that can be changed or set later in the bean's lifecycle.
● It also allows for the possibility of reconfiguring the bean by calling the setter method
again.
● Setter injection is specified by using the @Autowired annotation on the setter method.

Field Injection:
● Dependencies are injected directly into the fields of a class.
● This is the least preferred method for several reasons, including the difficulty of unit
testing, the inability to create immutable fields, and the interference with the
encapsulation principle.
● To use field injection, you annotate the field with @Autowired.

When to Use Each One:


Constructor Injection: Use this when you need to enforce the dependency requirement. If the
object cannot exist without the dependency, constructor injection is the most appropriate. This
method is also preferred for required and immutable dependencies.

Setter Injection: Choose this for optional dependencies that can be altered after the bean has
been constructed. It's also a good choice when there are too many dependencies, leading to a
constructor with a large number of parameters.

Field Injection: While it's convenient for simple applications, it's generally discouraged for the
reasons mentioned above. However, it can still be used in certain scenarios like tests or when
dealing with frameworks that require parameterless constructors.
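
The contrast between constructor and setter injection can be sketched in plain Java, with the wiring done by hand in main. No Spring container is involved here; in a Spring application the container would do this wiring, and @Autowired would mark the constructor or setter. All class names below are hypothetical.

```java
import java.util.Objects;

/** Plain-Java illustration of constructor injection vs. setter injection. */
class InjectionDemo {
    interface MessageSender {
        String send(String text);
    }

    static class EmailSender implements MessageSender {
        public String send(String text) { return "email: " + text; }
    }

    static class SmsSender implements MessageSender {
        public String send(String text) { return "sms: " + text; }
    }

    /** Constructor injection: the dependency is mandatory and the field is final. */
    static class OrderService {
        private final MessageSender sender;

        OrderService(MessageSender sender) {
            this.sender = Objects.requireNonNull(sender);
        }

        String confirm(String orderId) {
            return sender.send("order " + orderId + " confirmed");
        }
    }

    /** Setter injection: the dependency is optional and can be replaced later. */
    static class ReportService {
        private MessageSender sender; // may stay null if nothing is injected

        void setSender(MessageSender sender) { this.sender = sender; }

        String publish(String report) {
            return sender == null ? "not sent: " + report : sender.send(report);
        }
    }

    public static void main(String[] args) {
        OrderService orders = new OrderService(new EmailSender());
        System.out.println(orders.confirm("42"));

        ReportService reports = new ReportService();
        System.out.println(reports.publish("q1")); // works without a sender
        reports.setSender(new SmsSender());        // dependency supplied after creation
        System.out.println(reports.publish("q1"));
    }
}
```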
Java
Test

Smoke Testing
This is done after API development is complete. Simply validate if the APIs are working and
nothing breaks.
Functional Testing
This creates a test plan based on the functional requirements and compares the results with the
expected results.
Integration Testing
This test combines several API calls to perform end-to-end tests. The intra-service
communications and data transmissions are tested.
Regression Testing
This test ensures that bug fixes or new features don’t break the existing behaviors of APIs.
Load Testing
This tests applications’ performance by simulating different loads. Then we can calculate the
capacity of the application.
Stress Testing
We deliberately create high loads to the APIs and test if the APIs are able to function normally.
Security Testing
This tests the APIs against all possible external threats.
UI Testing
This tests the UI interactions with the APIs to make sure the data can be displayed properly.
Fuzz Testing
This injects invalid or unexpected input data into the API and tries to crash the API. In this way,
it identifies the API vulnerabilities.
Database System

Types of SQL Statements

● Data Definition Language (DDL) - Defines the structure of the database objects.
● Data Manipulation Language (DML) - Deals with the retrieval and manipulation of the
data stored in tables.
● Transaction Control Language (TCL) - Deals with the transaction within the database.
● Data Control Language (DCL) - Deals with the permissions and privileges of the
objects.

Join
Query
Index

DML
Select
/* Departments that have more than two employees */
SELECT d.id, d.name
FROM Departments d
JOIN Employees e ON d.id = e.id_departement
GROUP BY d.id, d.name
HAVING COUNT(e.id) > 2;

Insert

INSERT INTO TABLE_NAME (column1, column2, column3,...columnN)
VALUES (value1, value2, value3,...valueN);

/* The column list can be omitted when a value is supplied for every column, in table order */
INSERT INTO TABLE_NAME VALUES (value1,value2,value3,...valueN);

Update

UPDATE table_name
SET column1 = value1, column2 = value2...., columnN = valueN
WHERE [condition];

Delete

DELETE FROM table_name
WHERE [condition];

DDL

/* IDENTITY(1,1) is SQL Server syntax for an auto-incrementing column */
CREATE TABLE table_name(
column1 datatype [NOT NULL] [IDENTITY(1,1)],
column2 datatype,
column3 datatype,
.....
columnN datatype,
PRIMARY KEY( one or more columns )
);

Alter table

/* ADD COLUMN */
ALTER TABLE table_name ADD column_name datatype;

View

/* Using VIEWs */
CREATE VIEW view_name AS
SELECT column1, column2...
FROM table_name
WHERE [condition];
DROP VIEW view_name;

INDEX Constraint
Indexes can be used to speed up data retrieval. An index helps to speed up SELECT queries
and WHERE clauses, but it slows down data input, with the UPDATE and the INSERT
statements.

/* Creates an index on a table. Duplicate values are allowed: */
CREATE INDEX idx_lastname
ON Persons (LastName);

/* Creates a unique index on a table. Duplicate values are not allowed: */
CREATE UNIQUE INDEX idx_lastname
ON Persons (LastName);

/* Create an index on a combination of columns */
CREATE INDEX idx_fname
ON Persons (LastName, FirstName);

/* DROP INDEX Statement (SQL Server form: table_name.index_name) */
DROP INDEX Persons.idx_lastname;
