NO SQL IA-01_MICRO

Uploaded by

Anugala Aniketh
Copyright
© © All Rights Reserved

 Relational Database

o Relational databases have become such an embedded part of our computing
culture that it’s easy to take them for granted. It’s therefore useful to revisit the
benefits they provide.
o Persistent Data Management
 Relational databases provide a structured way to store and retrieve
large volumes of data, ensuring persistence beyond the limitations of
volatile memory (RAM). Unlike simple file systems, relational databases
allow efficient querying and manipulation of specific data elements.
o Concurrency Handling
 Enterprise applications often involve multiple users or systems
accessing the same dataset concurrently. Relational databases use
transactions to manage this complexity, ensuring data consistency and
preventing conflicts (e.g., double bookings).
o Integration Across Applications
 Relational databases serve as a central hub for enterprise data,
enabling multiple applications to share and access common datasets.
This fosters collaboration between disparate systems.
o A (Mostly) Standard Model
 The relational model and SQL provide a standardized framework that
developers and database administrators can apply across projects,
despite minor variations in vendor implementations.
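
The double-booking case under Concurrency Handling can be sketched with Python's built-in sqlite3 module. The `bookings` table, the `book_seat` helper, and the seat and user names are assumptions made for illustration only; a real system would use its own schema and isolation settings.

```python
import sqlite3

# Hypothetical schema: one row per seat, so the PRIMARY KEY forbids a second booking.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (seat TEXT PRIMARY KEY, booked_by TEXT NOT NULL)")


def book_seat(conn, seat, booked_by):
    """Try to book a seat; return False if someone already holds it."""
    try:
        # 'with conn' runs the statement in a transaction:
        # commit on success, rollback if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO bookings (seat, booked_by) VALUES (?, ?)",
                (seat, booked_by),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = book_seat(conn, "12A", "alice")   # the seat is free, so this succeeds
second = book_seat(conn, "12A", "bob")    # the seat is taken, so this is rejected
print(first, second)
```

The database, not the application, decides which booking wins, so the outcome stays the same however many clients try at once.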
 Explain Impedance mismatch taking a suitable example.
o Impedance mismatch refers to the disconnect between the object-oriented
programming (OOP) paradigm used in application development and the
relational paradigm used in traditional databases. The issue arises because
object-oriented applications model data as objects with complex hierarchies,
while relational databases store data in structured, tabular formats. This
difference leads to challenges in mapping objects to relational tables and vice
versa.
o Core Challenges of Impedance Mismatch
 Schema Mapping: Objects in OOP languages like Java or Python have
attributes and methods, while relational databases organize data into
rows and columns. Mapping these two structures is non-trivial.
 Inheritance vs. Flat Structures: Object hierarchies (e.g., parent-child
classes) do not directly map to relational tables, which are inherently
flat.
 Data Retrieval: Relational databases use SQL, which retrieves data in
sets, whereas OOP languages handle individual objects.
 Transaction Handling: OOP frameworks manage objects in memory,
while relational databases enforce consistency through transactions,
requiring synchronization.
o Example
o Scenario: A company uses a relational database to store employee details,
including engineers and managers. The database has a single table:

      ID    Name    Role       Salary
      101   Alice   Engineer   80,000
      102   Bob     Manager    120,00
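
A minimal sketch of the mismatch in this scenario, using Python dataclasses and the built-in sqlite3 module. The class names, the `ROLE_TO_CLASS` lookup, and the `load_employees` helper are assumptions for illustration; the point is that the object side has an inheritance hierarchy while the table only has a flat Role column, so a hand-written mapping layer is needed in between.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Employee:
    emp_id: int
    name: str
    salary: Optional[int]


@dataclass
class Engineer(Employee):
    pass


@dataclass
class Manager(Employee):
    pass


# The flat table has no notion of subclasses; the Role column stands in for the type.
ROLE_TO_CLASS = {"Engineer": Engineer, "Manager": Manager}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, role TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employees VALUES (101, 'Alice', 'Engineer', 80000)")
# Salary left NULL here: the figure in the table is incomplete.
conn.execute("INSERT INTO employees VALUES (102, 'Bob', 'Manager', NULL)")


def load_employees(conn):
    """Mapping layer: turn each row into an object of the matching subclass."""
    rows = conn.execute("SELECT id, name, role, salary FROM employees ORDER BY id")
    return [ROLE_TO_CLASS[role](emp_id, name, salary) for emp_id, name, role, salary in rows]


employees = load_employees(conn)
print(employees)
```

Every new subclass or attribute means touching both the class definitions and the mapping code, which is the maintenance cost the term impedance mismatch describes.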
 Explain the differences between NOSQL and traditional databases. Explain the
need of NOSQL. Describe how NOSQL is different from traditional databases

NoSQL Databases | Traditional Databases (SQL)
Schema-less, flexible data models (e.g., key-value, document, columnar, graph). | Fixed schema with rows and tables.
Horizontally scalable (adds servers to scale). | Vertically scalable (improves server hardware).
No standard query language; database-specific APIs. | Standardized SQL for querying data.
Dynamic, allows schema evolution without downtime. | Fixed schema requiring alterations for changes.
Often favors eventual consistency (CAP theorem). | Strong consistency (ACID compliance).
Real-time analytics, IoT, unstructured data, social media. | Transactional systems, financial applications.
Limited or database-specific transaction support. | Strong ACID-compliant transactions.
Optimized for specific workloads (e.g., reads, writes). | General-purpose performance for a variety of operations.
Weak support; relationships are typically handled by the application. | Strong support with JOIN operations.
MongoDB, Cassandra, DynamoDB, CouchDB. | MySQL, PostgreSQL, Oracle, SQL Server.
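
The schema rows of the comparison can be illustrated with a small sketch. It uses Python's built-in sqlite3 for the fixed-schema side and a plain dict as a stand-in for a document store; the `users` table, the `twitter` field, and the sample documents are assumptions for illustration, not the API of any particular NoSQL product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Fixed schema: a column that was never declared is rejected.
try:
    conn.execute("INSERT INTO users (id, name, twitter) VALUES (1, 'Alice', '@alice')")
    fixed_schema_accepted = True
except sqlite3.OperationalError:
    fixed_schema_accepted = False

# The table must be altered before the new attribute can be stored.
conn.execute("ALTER TABLE users ADD COLUMN twitter TEXT")
conn.execute("INSERT INTO users (id, name, twitter) VALUES (1, 'Alice', '@alice')")

# Document-style store (a dict standing in for a document database):
# each document carries its own fields, so no schema change is needed.
documents = {}
documents[1] = {"name": "Alice"}
documents[2] = {"name": "Bob", "twitter": "@bob", "tags": ["admin"]}

print(fixed_schema_accepted, documents[2])
```

The flexibility has a cost: with no declared schema, the application itself has to cope with documents that lack a field.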
Module -02

 Single Server Model


o The simplest distribution option for NoSQL data storage and access is Single
Server Distribution (SSD) of an application.
o A graph database processes the relationships between nodes on one server, so
the SSD model suits graph databases well.
o Aggregate-oriented stores (key-value, column-family, or BigTable-style) whose
datasets require sequential processing can also use the SSD model: the
application processes the data sequentially on a single server.
o Both the process and the datasets reside on the single server that runs the
application.
o There is no distributed system, so data replication and partitioning are not possible.
o Removes complexity; offers low cost, simple maintenance, and vertical scalability.
o Suitable when the application requires its dataset on the same instance: a
single server holds the data and runs the application.
 Master-Slave Replication
o The master directs the slaves.
o In the Master-Slave Distribution (MSD) model, the master’s data replicates onto
multiple slave servers.
o When a process updates the master, the update also propagates to the slaves.
Processes use the slaves for read operations.
o Read performance improves when a process runs over large datasets
distributed onto the slave nodes.
o Processing performance decreases for updates, because each update must
also be replicated to the slaves.
o Resilience for read operations is high: if data is not available from one slave
node, it is available from the other replicated nodes.
o The model uses distinct write and read paths: writes go to the master, reads go
to the slaves.
o Complexity: cluster-based processing is more complex than the other
architectures. Consistency can also be affected when updates take significant
time to reach the slaves.
o Master-slave replication provides greater scalability for read operations.
Replication provides resilience for reads, but the master does not provide
resilience for writes.
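
The write and read paths described above can be sketched in a few lines of Python. The `Node` and `MasterSlaveCluster` classes are hypothetical names, and the sketch assumes replication happens synchronously inside `write`; real systems usually replicate asynchronously, which is where stale reads come from.

```python
class Node:
    """A single server holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class MasterSlaveCluster:
    def __init__(self, slave_count):
        self.master = Node("master")
        self.slaves = [Node(f"slave-{i}") for i in range(slave_count)]

    def write(self, key, value):
        # Write path: only the master accepts updates, so a dead master blocks writes.
        if not self.master.alive:
            raise RuntimeError("master is down: writes are unavailable")
        self.master.data[key] = value
        for slave in self.slaves:
            if slave.alive:
                slave.data[key] = value  # replicate the update to each live slave

    def read(self, key):
        # Read path: any live slave can answer, which gives read resilience.
        for slave in self.slaves:
            if slave.alive:
                return slave.data[key]
        raise RuntimeError("no slave available for reads")


cluster = MasterSlaveCluster(slave_count=2)
cluster.write("course", "NoSQL")

cluster.slaves[0].alive = False                      # one slave fails
value_after_slave_failure = cluster.read("course")   # served by the other slave

cluster.master.alive = False                         # the master fails
try:
    cluster.write("course", "SQL")
    write_succeeded = True
except RuntimeError:
    write_succeeded = False

print(value_after_slave_failure, write_succeeded)
```

Losing a slave leaves reads working, while losing the master stops writes until a new master is appointed.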

 Peer-to-Peer Distribution Model
o The Peer-to-Peer distribution (PPD) model and replication show the following
characteristics:
 All replication nodes accept read requests and send the responses.
 All replicas function equally.
 Node failures do not cause loss of write capability, as another replicated
node responds.
o Cassandra adopts the PPD model.
o The data is distributed among all the nodes in a cluster.
o Performance can be further enhanced by adding nodes. Since every node both
reads and writes, a replicated node also has updated data. The biggest
complication in the model is therefore consistency: when the same data is
written on different nodes at the same time, a write-write inconsistency occurs.
o Peer-to-peer replication provides resilience for both reads and writes.
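
A small sketch of both sides of the peer-to-peer model: writes survive a node failure, but two peers can end up holding different values for the same key. The `Peer` class, the `write` and `conflicting_values` helpers, and the seat key are hypothetical, and no synchronisation between peers is modelled.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


def write(peers, key, value, preferred=0):
    """Send the write to the preferred peer, falling back to any other live peer."""
    order = peers[preferred:] + peers[:preferred]
    for peer in order:
        if peer.alive:
            peer.data[key] = value
            return peer.name
    raise RuntimeError("no live peer")


def conflicting_values(peers, key):
    """Distinct values held for a key; more than one means a write-write conflict."""
    return {peer.data[key] for peer in peers if key in peer.data}


peers = [Peer("A"), Peer("B"), Peer("C")]

# Resilience: peer A is down, yet the write is still accepted by peer B.
peers[0].alive = False
accepted_by = write(peers, "seat-12A", "Martin", preferred=0)

# Conflict: before the peers synchronise, another client writes the same key on peer C.
write(peers, "seat-12A", "Pramod", preferred=2)
conflict = conflicting_values(peers, "seat-12A")

print(accepted_by, sorted(conflict))
```

A real peer-to-peer store needs a policy for resolving such conflicts, for example coordinating writes or merging the divergent values afterwards.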


 Sharding
o Sharding stores different parts of the data on different sets of data nodes,
clusters, or servers. For example, a huge university student database is divided,
on sharding, into smaller databases called shards. Each shard may correspond
to the database for an individual course and year, and each shard is stored on a
different node or server.
o It distributes different data across multiple servers, so each server acts as the
single source for a subset of the data.
o It follows a shared-nothing (SN) architecture and scales horizontally across
multiple servers. Splitting large data into smaller parts on separate servers
reduces the load on any single server and improves read and write
performance.
o Sharding is a form of data partitioning.
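
The university example can be sketched as follows. The student ID format (`COURSE-YEAR-NUMBER`), the `ShardedStore` class, and the use of a CRC32 hash to pick a server are assumptions for illustration; real databases use their own partitioning schemes.

```python
import zlib


def shard_key(student_id):
    """Course and year form the shard key, e.g. 'CS-2023-001' -> 'CS-2023'."""
    course, year, _ = student_id.split("-")
    return f"{course}-{year}"


class ShardedStore:
    def __init__(self, shard_count):
        # Each dict stands in for a separate server.
        self.shards = [{} for _ in range(shard_count)]

    def shard_index(self, student_id):
        # crc32 is stable across runs, unlike the built-in hash() for strings.
        key = shard_key(student_id)
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def put(self, student_id, record):
        self.shards[self.shard_index(student_id)][student_id] = record

    def get(self, student_id):
        return self.shards[self.shard_index(student_id)].get(student_id)


store = ShardedStore(shard_count=3)
students = {
    "CS-2023-001": "Asha",
    "CS-2023-002": "Ravi",
    "EC-2024-001": "Meera",
    "ME-2022-007": "John",
}
for student_id, name in students.items():
    store.put(student_id, name)

print([len(shard) for shard in store.shards])
```

Students of the same course and year always land on the same shard, so a query for one class touches a single server.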
 Replication
o Replication is the process of copying and maintaining the same data on
multiple servers to improve availability, fault tolerance, and read performance.
o A primary (or master) server handles all write operations and replicates data to
secondary (or replica) servers.
o Secondary servers can handle read requests, while the primary focuses on
writes.
 Advantages of Replication
o High Availability: If the primary server fails, a replica can take over (failover).
o Load Balancing: Read operations can be distributed among replicas, reducing
the load on the primary server.
o Data Redundancy: Reduces the risk of data loss due to hardware failure.
 Drawbacks
o Consistency: Ensuring data is synchronized across replicas in real-time can be
challenging, especially in systems with eventual consistency.
o Write Bottleneck: Since only the primary handles writes, the system may face
bottlenecks under heavy write loads.
 Combining Sharding and Replication
o Combining sharding and replication provides the best of both worlds:
scalability through sharding and reliability through replication.
o Each shard is replicated across multiple servers.
o Advantages
 Scalability: Sharding allows the database to scale horizontally by
distributing data across shards.
 High Availability: Even if a server within a shard fails, replicas ensure
the shard remains accessible.
 Load Distribution: Sharding distributes write operations across shards,
while replication distributes read operations across replicas.
 Fault Tolerance: Replicas ensure data redundancy, reducing the risk of
data loss within a shard.
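
A minimal sketch of the combined layout, in which every shard is itself a small replica set. The class names, the CRC32-based routing, and the synchronous copy to every live server are simplifying assumptions, not the behaviour of a specific database.

```python
import zlib


class ReplicatedShard:
    """One shard: server 0 is the primary, the rest are replicas."""

    def __init__(self, replica_count):
        self.servers = [{} for _ in range(replica_count + 1)]
        self.alive = [True] * (replica_count + 1)

    def write(self, key, value):
        # Simplification: the write is copied to every live server at once.
        for server, up in zip(self.servers, self.alive):
            if up:
                server[key] = value

    def read(self, key):
        # Any live copy can serve the read.
        for server, up in zip(self.servers, self.alive):
            if up:
                return server.get(key)
        raise RuntimeError("every server in this shard is down")


class ShardedReplicatedStore:
    def __init__(self, shard_count, replica_count):
        self.shards = [ReplicatedShard(replica_count) for _ in range(shard_count)]

    def shard_for(self, key):
        # Sharding: the key decides which shard owns the data.
        return self.shards[zlib.crc32(key.encode("utf-8")) % len(self.shards)]

    def write(self, key, value):
        self.shard_for(key).write(key, value)

    def read(self, key):
        return self.shard_for(key).read(key)


store = ShardedReplicatedStore(shard_count=2, replica_count=2)
store.write("student:42", "Asha")

store.shard_for("student:42").alive[0] = False   # the shard's primary fails
value_after_failure = store.read("student:42")   # a replica answers instead
print(value_after_failure)
```

Sharding decides which group of servers owns a key, and replication decides how many copies exist inside that group.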
 Explain Update Consistency with respect to NOSQL taking suitable examples
o Update consistency refers to how changes to data are handled and when they
become visible to users or applications. Unlike traditional relational databases,
which often prioritize strong consistency (ensuring updates are immediately
visible to all users), many NoSQL systems prioritize availability and partition
tolerance (as per the CAP theorem) over immediate consistency, opting for
eventual consistency instead.
o A write-write conflict occurs when two people update the same data item at the
same time. Suppose Martin and Pramod both do so.
o When the writes reach the server, it serializes them: it decides to apply one,
then the other. Let’s assume it uses alphabetical order and picks Martin’s
update first, then Pramod’s.
o Without any concurrency control, Martin’s update would be applied and
immediately overwritten by Pramod’s. In this case Martin’s is a lost update.
o Approaches for maintaining consistency in the face of concurrency are often
described as pessimistic or optimistic.
o A pessimistic approach works by preventing conflicts from occurring; an
optimistic approach lets conflicts occur, but detects them and takes action to
sort them out.
o A common optimistic approach is a conditional update, where any client that
does an update tests the value just before updating it to see if it has changed
since the client’s last read.
o In this case, Martin’s update would succeed but Pramod’s would fail, because
the value changed after Pramod’s last read.
o Types of Update Consistency in NoSQL
 Strong Consistency: Updates are immediately visible to all users and
systems. Example: banking application.
 Eventual Consistency: Updates propagate to all nodes over time,
meaning some nodes may briefly hold stale data. Example: e-commerce
application.
 Tunable Consistency: Some NoSQL databases allow the user to choose
the level of consistency. Example: social media application.
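
The conditional update from the Martin and Pramod example can be sketched as below. The `VersionedStore` class is hypothetical, and it compares a version number rather than the value itself, which is one common way to implement the test; the phone numbers are made-up sample data.

```python
class VersionedStore:
    """Toy store that keeps a version number next to every value."""

    def __init__(self):
        self._items = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._items.get(key, (0, None))

    def conditional_update(self, key, expected_version, new_value):
        """Apply the update only if nobody has written since the caller's read."""
        current_version, _ = self.read(key)
        if current_version != expected_version:
            return False  # the data changed since the caller last read it
        self._items[key] = (current_version + 1, new_value)
        return True


store = VersionedStore()
store.conditional_update("phone", 0, "555-0100")   # initial value, becomes version 1

# Both clients read the same version before writing.
martin_version, _ = store.read("phone")
pramod_version, _ = store.read("phone")

martin_ok = store.conditional_update("phone", martin_version, "555-0111")
pramod_ok = store.conditional_update("phone", pramod_version, "555-0122")

print(martin_ok, pramod_ok, store.read("phone"))
```

Pramod's client learns that its update failed and can re-read the value and retry, so no update is silently lost.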
 Read Consistency
o Read consistency refers to how up-to-date and accurate data is when it is read
from a NoSQL database, especially in distributed systems. NoSQL databases
offer different levels of read consistency based on how they balance
performance, availability, and consistency as per the CAP theorem.
o Example: Martin adds a line item to his order, Pramod then reads the line items
and shipping charge, and then Martin updates the shipping charge. Pramod sees
the new line item with the old shipping charge. This is an inconsistent read, or
read-write conflict.
o Pramod has done a read in the middle of Martin’s write.
o This type of consistency is called logical consistency: ensuring that different
data items make sense together. To avoid a logically inconsistent read-write
conflict, relational databases support the notion of transactions.
o Provided Martin wraps his two writes in a transaction, the system guarantees
that Pramod will either read both data items before the update or both after the
update.
o In NoSQL databases, any update that affects multiple aggregates leaves open a
time when clients could perform an inconsistent read. The length of time an
inconsistency is present is called the inconsistency window.
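
The order example can be simulated in plain Python. The flat shipping rule in `shipping_for`, the item names, and the copy-then-swap step that stands in for a transaction commit are all assumptions made for illustration; no real database is involved.

```python
import copy


def shipping_for(items):
    # Hypothetical pricing rule: a flat charge of 5 per line item.
    return 5 * len(items)


def is_consistent(order):
    """The order is logically consistent when shipping matches its line items."""
    return order["shipping"] == shipping_for(order["line_items"])


# Without a transaction: two separate writes, with a read in between.
order = {"line_items": ["book"], "shipping": 5}
order["line_items"].append("lamp")                      # Martin's first write
pramod_view = copy.deepcopy(order)                      # Pramod reads inside the window
order["shipping"] = shipping_for(order["line_items"])   # Martin's second write

# With a transaction (simulated): changes are prepared on a private draft
# and become visible together in a single step.
committed = {"line_items": ["book"], "shipping": 5}
draft = copy.deepcopy(committed)
draft["line_items"].append("lamp")
pramod_view_tx = copy.deepcopy(committed)               # Pramod still sees the old state
draft["shipping"] = shipping_for(draft["line_items"])
committed = draft                                       # commit: both writes appear at once

print(is_consistent(pramod_view), is_consistent(pramod_view_tx), is_consistent(committed))
```

In the first half the inconsistency window is the gap between Martin's two writes; the second half closes it by publishing both writes together.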
