NoSQL Final
1. What is NoSQL? Explain briefly about the aggregate data model with a neat diagram, considering an example of relational and aggregate models. 8M/10M
Figure 2.3. An aggregate data model
1. What is NoSQL?
NoSQL databases are non-relational databases designed to store, retrieve, and manage unstructured or semi-structured
data, which doesn’t fit neatly into traditional rows and columns like in relational databases. NoSQL databases are highly
flexible, allowing for different data models like key-value, document, column-family, and graph. They are ideal for big
data and real-time applications, offering scalability and speed.
The aggregate data model in NoSQL groups related data together as a single unit, called an "aggregate." This means
that, instead of splitting data across multiple tables (like in relational databases), related data is stored together.
Aggregates simplify data management, especially in large, distributed systems, making it easy to retrieve and update
data without complex joins.
In NoSQL, an aggregate could include all data related to a customer, such as their orders and addresses, within a single
record. This reduces the number of interactions needed to access the data and makes handling data across multiple
servers (sharding) easier.
Relational vs. Aggregate Model Example:
• Relational Model: Data is spread across multiple tables, with separate tables for customers, orders, and items.
These tables are linked by keys, so accessing all data about a customer’s orders requires joins.
• Aggregate Model (NoSQL): All related data is embedded in one document. For example, a customer document
may contain their orders, items, and addresses directly, simplifying data retrieval.
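As a rough illustration (field names are hypothetical, not taken from any particular product), the same customer data might look like this as a single aggregate in a document store:
// Hypothetical customer aggregate: orders and addresses embedded in one document.
const customer = {
  id: 1,
  name: "Martin",
  addresses: [{ city: "Chicago", type: "billing" }],
  orders: [
    {
      orderId: 99,
      items: [{ productId: "P123", quantity: 2, price: 100 }],
      shippingAddress: { city: "Chicago" }
    }
  ]
};
// A relational model would instead split this into customer, address, order,
// and order_item tables linked by foreign keys and reassembled with joins.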
Relational databases have become an embedded part of our computing culture. The benefits they provide include:
1. Getting at Persistent Data: Databases allow the storage of large amounts of persistent data, offering flexibility
over file systems by enabling applications to access small bits of information quickly and easily.
2. Concurrency: Relational databases help manage data access by multiple users through transactions, which aid
in preventing issues like double booking. Transactions also support error handling by allowing changes to be
rolled back if an error occurs.
3. Integration: Databases enable inter-application collaboration by allowing multiple applications to store and
access data in a single database, making shared data accessible and manageable.
4. A (Mostly) Standard Model: Relational databases provide a mostly standard model, allowing developers to
apply knowledge across different projects. Despite some differences in SQL dialects, core mechanisms like
transactions operate similarly across databases.
3. Explain briefly about Impedance mismatch, with a neat diagram
Impedance Mismatch refers to the difficulties that arise from the differences between the relational model used in
databases and the in-memory data structures used in application programming. Here are the key points:
1. Definition: Impedance mismatch is the discrepancy between the relational model (tables and rows) and in-
memory data structures (objects) in programming.
2. Relational Model: Data is organized into tables (relations) and rows (tuples), where each tuple consists of
simple name-value pairs.
3. Limitations: Relational tuples cannot contain complex structures, such as nested records or lists, which are
common in in-memory data structures.
4. Translation Requirement: Developers must translate rich in-memory structures into a relational format for
storage, leading to added complexity.
5. Historical Context: In the 1990s, object-oriented databases emerged as a potential solution but ultimately faded
as relational databases remained dominant.
6. ORM Solutions: Object-Relational Mapping (ORM) frameworks like Hibernate and iBATIS help manage this
mismatch but can introduce their own performance issues.
7. Ongoing Challenge: Despite advancements, impedance mismatch continues to be a source of frustration for
developers.
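A minimal sketch of the mismatch (names are hypothetical): an in-memory order with a nested list of line items has no direct single-table representation and must be flattened before it can be stored relationally.
// Rich in-memory structure: an order holding a nested list of line items.
const order = {
  orderId: 99,
  customer: "Martin",
  lineItems: [
    { productId: "P123", quantity: 2 },
    { productId: "P456", quantity: 1 }
  ]
};
// To persist this relationally it must be split into flat rows, e.g.:
//   orders(order_id, customer)            -> (99, "Martin")
//   line_items(order_id, product_id, qty) -> (99, "P123", 2), (99, "P456", 1)
// ORM frameworks such as Hibernate automate this translation, at some cost.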
4. Write a short note on:
Lack of Distinction: Relational databases do not differentiate between aggregate relationships and other
relationships.
• Aggregate-Ignorant: They cannot optimize storage or distribution based on aggregate structures.
• Modeling Challenges: Existing modeling techniques often lack consistent semantics for defining aggregates.
• Cluster Efficiency: Aggregate orientation enhances data manipulation on clusters by minimizing node queries.
• Transaction Limitations: Supports atomic operations only on individual aggregates, so the application must manage any operation that spans multiple aggregates.
• Flexibility: Allows storing various types of data without strict structure limits.
• Key-Based Access: Primarily accessed through lookups based on unique keys.
5. Explain about Graph database, with neat diagram
Graph Databases
• Definition: Graph databases store data as a collection of nodes (entities) and edges (relationships) that connect
them. They are designed to handle complex interconnections between data points efficiently.
• Structure:
o Nodes: Represent entities (e.g., people, products).
o Edges: Represent relationships between nodes (e.g., "likes," "purchases").
o Attributes: Nodes and edges can have properties or attributes associated with them.
• Motivation: Graph databases emerged as a response to the limitations of relational databases, particularly in
managing highly connected data without incurring significant performance costs.
• Use Cases: Ideal for scenarios with complex relationships such as:
o Social networks
o Recommendation systems
o Fraud detection
o Knowledge graphs
• Querying:
o Graph databases excel in queries that navigate through relationships, allowing for efficient traversal.
o Queries often involve starting from a node and exploring its connections (e.g., "Find all products liked
by friends of a user").
• Performance:
o Traversal operations are optimized in graph databases, making them cheaper than relational databases,
which often require costly joins.
o The architecture is typically designed for high read performance, especially for connected data.
• ACID Transactions:
o Graph databases support ACID transactions, but a single transaction may need to span multiple nodes and edges to keep the data consistent.
6. What are Schemaless database? Explain
Schemaless Databases
Definition: Schemaless databases are a type of NoSQL database that do not require a predefined schema before storing
data. This allows for greater flexibility in data storage and structure.
Key Characteristics
1. Flexibility:
o Users can store data in any format without needing to define a rigid structure upfront.
o Records can contain varying fields, accommodating non-uniform data easily.
2. Data Storage:
o Key-Value Stores: Store data as pairs of keys and values, allowing arbitrary data under a key.
o Document Databases: Allow for unstructured documents, enabling various data structures.
o Column-Family Databases: Permit dynamic addition of columns for each record.
o Graph Databases: Enable free addition of nodes and edges.
3. Handling Non-Uniform Data:
o Schemaless databases allow records to have different fields without unnecessary nulls or meaningless
columns.
4. Implicit Schema:
o While schemaless databases do not enforce a schema, application code often relies on an implicit
schema to interpret and manipulate the data, leading to assumptions about field names and data types.
5. Decoupling of Schema and Storage:
o The schema is effectively handled in the application logic rather than within the database, which can
create challenges in understanding data structure and consistency across multiple applications.
Advantages
• Rapid Iteration: Easy to modify data storage as project requirements evolve without needing extensive schema
changes.
• Dynamic Changes: New fields can be added as needed, and obsolete fields can be ignored without impacting
existing data.
• Simplicity: Simplifies the data insertion process, particularly in early development stages.
Disadvantages
• Complexity in Data Retrieval: Without a defined schema, it can be challenging to understand the structure of
the data, necessitating deep dives into application code.
• Lack of Consistency Checks: The database cannot enforce validations based on the schema, potentially leading
to inconsistent data manipulation across different applications.
• Difficulty in Data Management: Changing how data is stored or redefining aggregates can be as complex as
it is in relational databases.
Use Cases
• Rapidly Evolving Applications: Ideal for projects where data requirements change frequently and
unpredictably.
• Handling Varied Data: Suitable for applications dealing with diverse datasets, such as content management
systems, social media platforms, and IoT applications.
Conclusion
Schemaless databases offer significant advantages in flexibility and adaptability, making them a popular choice in
dynamic development environments. However, they require careful management of implicit schemas and data integrity,
particularly when multiple applications access the same database.
7. Explain the Modeling for data access with respect to Key-value store
Modeling for Data Access in Key-Value Stores
Key-value stores provide a simple and efficient way to manage data through the use of unique keys paired with values.
When modeling data for access in key-value stores, several key considerations come into play:
1. Data Structure
• Key-Value Pair: Data is stored as pairs where each key serves as a unique identifier for the value. The value
can be a simple data type or a more complex object.
• Embedding Data: Related data can be embedded within a single value object. For instance, customer details
and their associated orders can be stored together to facilitate easy access.
2. Data Retrieval
• Reading Data: Accessing related data typically involves retrieving the entire object associated with a key. If
detailed information is required (like specific orders), the whole object needs to be fetched and processed.
3. Handling References
• Using References: To maintain relationships, separate data objects can include references to related entities.
For example, a customer object may contain references to their order IDs, enabling retrieval of all orders without
duplicating data.
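A rough sketch of that reference-based layout (keys and field names are invented for illustration):
// Customer aggregate stored under its own key, holding only order IDs.
const store = new Map();
store.set("customer:1001", {
  name: "Martin",
  orderIds: ["order:9001", "order:9002"]
});
store.set("order:9001", { total: 250, items: [{ productId: "P123", qty: 2 }] });
store.set("order:9002", { total: 90, items: [{ productId: "P456", qty: 1 }] });

// Fetching all of a customer's orders means one lookup per referenced key.
const customer = store.get("customer:1001");
const orders = customer.orderIds.map(id => store.get(id));
console.log(orders.length); // 2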
4. Denormalization for Read Optimization
• Performance: The design should prioritize read performance, often requiring the duplication or aggregation of data to limit the number of read operations.
• Simplicity: A straightforward structure is essential. Avoid overly complex nested objects to ensure easy access and manipulation of data.
• Scalability: The model should accommodate future growth, allowing for new requirements without necessitating significant changes to the existing structure.
8. Explain the following: a) Column-family store, with a neat diagram b) Emergence of NoSQL
a) Column-Family Store
Definition: Column-family stores, also known as column-family databases, are a type of NoSQL database designed to
handle large amounts of structured and semi-structured data. They organize data into column families, allowing for a
flexible schema and optimized read/write performance.
Access Patterns:
• You can access either an entire row or specific columns within that row. For example, to retrieve a customer's
name, you might use a command like get('1234', 'name').
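As a loose illustration (data and field names made up, not tied to any specific product), a row in a column-family store can be pictured as a nested map keyed first by row key, then by column family, then by column:
// Row key -> column family -> column -> value (illustrative only).
const customers = {
  "1234": {
    profile: { name: "Martin", city: "Boston" },   // "profile" column family
    orders: { "ord-99": "2011-01-15" }             // "orders" column family
  }
};
// get('1234', 'name') conceptually reads one column from one row:
const name = customers["1234"].profile.name;       // "Martin"
// New columns can be added to any row without a schema change.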
Advantages:
• Flexibility: New columns can be added to rows without requiring schema changes, making it adaptable to
evolving data needs.
• Performance: They are optimized for scenarios where reads involve accessing a few columns across many
rows, improving efficiency.
• Scalability: Designed for distributed architectures, enabling horizontal scaling and high availability.
b) Emergence of NoSQL
Definition: NoSQL refers to a broad category of databases that do not primarily use SQL as their query language. These
databases are designed to handle large volumes of diverse data types and offer high performance and scalability.
Historical Context:
• The term "NoSQL" was first used in the late 1990s by Carlo Strozzi to describe an open-source relational
database that didn’t use SQL. However, this early use did not influence the contemporary understanding of
NoSQL.
• The modern concept of NoSQL emerged from a meetup in June 2009 in San Francisco, organized by Johan
Oskarsson. This event focused on discussing new database technologies inspired by systems like Google
BigTable and Amazon Dynamo.
Key Points:
1. Initial Meetup: The meetup aimed to explore various projects that experimented with non-relational data
storage solutions.
2. Naming: The name "NoSQL" was chosen for its brevity and memorability on social media, rather than for a precise technical meaning.
MODULE-2
a) Single Server
Definition: A single-server distribution model involves running a database on one machine, handling all data reads and
writes without distributing data across multiple servers.
Advantages:
1. Simplicity: Easier to manage with no complex network issues or data consistency problems.
2. Developer-Friendly: Application developers find it simpler to work with a single server, avoiding the
challenges of distributed systems.
3. Cost-Effective: Lower operational costs compared to maintaining a cluster of servers.
4. Performance: Sufficient for applications with moderate data needs, especially if optimized.
Use Cases:
• Graph Databases: Best for handling complex relationships without the overhead of distribution.
• Document and Key-Value Stores: Suitable for applications focusing on aggregates rather than heavy concurrent access.
Conclusion: The single-server model is a practical choice for many applications, allowing organizations to keep
operations straightforward and efficient.
b) Combining Sharding and Replication
Definition: Combining sharding and replication in a distributed database helps manage data more efficiently and ensures
availability.
Sharding:
• What It Is: Dividing a large dataset into smaller pieces (shards) that are spread across multiple servers for better
performance.
Replication:
• What It Is: Keeping copies of data on different servers to ensure that if one fails, the system can still access the
data from other servers.
How They Work Together:
• The dataset is split into shards for performance, and each shard is replicated across several nodes, so every piece of data lives on more than one server.
Benefits:
1. Higher Availability: If one node fails, data is still accessible from other replicas.
2. Load Balancing: Read requests can be distributed across replicas, reducing the load on master nodes.
3. Scalability: Allows the system to grow easily by adding more nodes and shards as data increases.
Conclusion: Combining sharding and replication creates a powerful, flexible data management system that enhances
performance and ensures data availability.
2. Explain the following i. CAP theorem ii. Quorums iii. Relaxing consistency iv. Relaxing durability
i. CAP Theorem
The CAP Theorem, proposed by Eric Brewer, states that in a distributed data store, you can only guarantee two out of
three properties: Consistency, Availability, and Partition Tolerance.
• Consistency: All nodes see the same data at the same time, ensuring that every read receives the most recent
write.
• Availability: Every request received by a non-failing node must result in a response, ensuring that the system
remains operational even if some nodes fail.
• Partition Tolerance: The system continues to operate despite network partitions that separate nodes and
prevent communication.
In practice, when network partitions occur (which is common), a system must choose between sacrificing consistency
or availability, leading to a compromise that aligns with the specific needs of the application.
ii. Quorums
Quorums are a strategy used in distributed systems to ensure data consistency during read and write operations. They
specify the minimum number of nodes required to confirm an operation to maintain a certain level of consistency.
• Write Quorum (W): The number of nodes that must acknowledge a write before it is considered successful.
To ensure strong consistency, W must be greater than half the total number of replicas (N), expressed as W>N/2.
• Read Quorum (R): The number of nodes that must be contacted to read the most recent data. To guarantee a
consistent read, the sum of the read and write quorums must also exceed the total number of replicas: R+W>N.
By defining quorums, systems can minimize the risk of reading stale data or encountering write-write conflicts,
balancing the trade-offs between consistency, availability, and performance.
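A quick worked example with made-up numbers: with N = 3 replicas, choosing W = 2 and R = 2 satisfies both conditions.
// Illustrative quorum check for N = 3 replicas.
const N = 3, W = 2, R = 2;
console.log(W > N / 2);  // true: a majority must acknowledge every write
console.log(R + W > N);  // true: every read quorum overlaps the latest write quorum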
iii. Relaxing Consistency
Relaxing Consistency refers to the practice of allowing some level of inconsistency in order to improve system
performance and availability. In many systems, particularly distributed ones, achieving strong consistency can lead to
trade-offs in responsiveness or availability.
• Isolation Levels: In traditional databases, different transaction isolation levels (like read-committed or snapshot
isolation) can allow queries to access uncommitted data, thus sacrificing strict consistency for better performance.
• Domain Tolerance: Different applications have varying tolerances for inconsistency. For example, e-
commerce systems might allow temporary discrepancies in shopping cart data, as long as users can continue
interacting with the system without delays.
In some cases, sacrificing consistency enables faster operations, like allowing overbooking in hotel reservations or
merging shopping carts during checkout, provided users can review their final orders.
iv. Relaxing Durability
Relaxing Durability involves sacrificing some degree of data durability to achieve higher performance. While
durability ensures that once a transaction is committed, it will survive failures, in certain scenarios, this can lead to
unacceptable latency.
• In-Memory Operations: Systems can opt to keep data primarily in memory and periodically flush it to disk.
This improves responsiveness but risks losing data if a crash occurs before the latest changes are saved.
• Use Cases: For example, user session states may not require strict durability since losing session data is less
critical than ensuring a fast user experience. Similarly, telemetry data may prioritize capture speed over
complete durability.
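As one concrete (illustrative) knob, MongoDB allows an unacknowledged write that does not wait for the server to confirm, trading durability for latency; the collection name here is assumed.
// Unacknowledged write: the client does not wait for confirmation,
// so a crash can silently lose this insert (assumes a "sessions" collection).
db.sessions.insertOne(
  { sessionId: "abc123", lastSeen: new Date() },
  { writeConcern: { w: 0 } }
);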
3. Define Version stamps. Explain briefly about various approaches of constructing version stamps
Version Stamps
Definition:
Version stamps are metadata associated with data items that help track their versioning and changes over time. They are
crucial in distributed systems to manage consistency and conflicts, particularly when multiple nodes can update the
same data independently.
Approaches to Constructing Version Stamps:
1. Counters:
o Description: Each time a node updates data, it increments a counter and uses this value as the version
stamp.
o Use Case: Works well for master-slave replication models, where a single authoritative source controls
the versioning.
2. Timestamps:
o Description: Each update is tagged with the current time as the version stamp.
o Challenges:
▪ Difficult to ensure consistent time across distributed nodes.
▪ Cannot detect write-write conflicts effectively, making it suitable primarily for single-master
systems.
3. Version Stamp Histories:
o Description: All nodes maintain a history of version stamps, allowing them to track the relationships
between different versions.
o Use Case: Effective in distributed version control systems, where clients or servers store the history to
detect inconsistencies.
4. Vector Stamps (Vector Clocks/Version Vectors):
o Description: A vector stamp consists of a set of counters, with one counter for each node in the system.
Each node increments its own counter upon an update.
o Synchronization: Nodes synchronize their vector stamps during communication. This allows
comparison of version stamps:
▪ If all counters of one stamp are greater than or equal to another, it is the newer version.
▪ If both stamps have counters greater than the other, it indicates a write-write conflict.
o Flexibility: Missing values in the vector are treated as zero, which facilitates the addition of new nodes
without invalidating existing stamps.
4. Explain Version stamps on multiple nodes.
Version stamps are essential in managing data consistency across multiple nodes in a distributed system, particularly
in a peer-to-peer model. Here’s a concise breakdown based on the provided text:
Basic Concepts
1. Single Authoritative Source: In a master-slave setup, version stamps are straightforward as they are managed
by the master. Slaves simply follow the master's version stamps.
2. Peer-to-Peer Challenges: In a decentralized system, multiple nodes can update data independently, leading to
potential inconsistencies. When querying multiple nodes, they might return different versions of the same
data.
Managing Inconsistencies
• Version Stamps with Counters: Each node maintains a counter that increments with each update. This
allows nodes to determine which version is more recent by comparing counter values.
• Multiple-Master Cases: In scenarios where all nodes can act as masters, a more sophisticated approach is
necessary. Keeping a history of version stamps enables nodes to verify the relationships between updates (e.g.,
checking if one update is an ancestor of another).
Approaches to Versioning
1. Timestamps: While timestamps can denote the order of updates, they often fail due to time synchronization
issues across nodes and cannot effectively detect conflicts.
2. Vector Stamps: This method uses an array of counters (one for each node). For example, a vector stamp for
three nodes might look like [blue: 43, green: 54, black: 12]. Each time a node updates, it
increments its respective counter, allowing nodes to synchronize their vector stamps during communication.
• A vector stamp is considered newer if all its counters are greater than or equal to those in an older stamp.
• If two stamps contain higher values for different counters, a conflict arises (e.g., [blue: 1, green: 2,
black: 5] vs. [blue: 2, green: 1, black: 5]).
• Missing values in a vector are treated as zero (e.g., [blue: 6, black: 2] becomes [blue: 6,
green: 0, black: 2]), facilitating the addition of new nodes without disrupting the existing system.
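A minimal sketch of that comparison logic (node names and counter values are invented):
// Compare two vector stamps; missing counters are treated as zero.
function compareVectorStamps(a, b) {
  const nodes = new Set([...Object.keys(a), ...Object.keys(b)]);
  let aNewer = false, bNewer = false;
  for (const n of nodes) {
    const x = a[n] || 0, y = b[n] || 0;
    if (x > y) aNewer = true;
    if (y > x) bNewer = true;
  }
  if (aNewer && bNewer) return "conflict";  // write-write conflict
  if (aNewer) return "a is newer";
  if (bNewer) return "b is newer";
  return "equal";
}

console.log(compareVectorStamps(
  { blue: 1, green: 2, black: 5 },
  { blue: 2, green: 1, black: 5 }
)); // "conflict"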
5. Explain the following a) Sharding b) Master-Slave replication c) Peer-to-peer replication
a) Sharding
Definition: Sharding is a horizontal scaling technique that involves distributing parts of a dataset across multiple servers
(shards). Each shard is responsible for a subset of the data, allowing multiple users to access different data concurrently.
Key Points:
• Load Distribution: Ideal sharding allows users to access different server nodes, balancing the load evenly
across servers.
• Aggregate Orientation: Data that is frequently accessed together is grouped into aggregates, enhancing
performance by minimizing cross-node requests.
• Data Placement: Factors such as physical location and access patterns can inform where data is stored. For
instance, data related to Boston residents can be placed in an eastern U.S. data center.
• Challenges: Sharding can complicate application logic and may require rebalancing, which involves migrating
data and updating code.
• Resilience: Sharding alone does not improve resilience, as a node failure renders its shard’s data unavailable. It
can, however, limit the impact of such failures to affected users.
• Planning: Transitioning from a single-node setup to sharding should be done early, allowing sufficient
headroom for the change without overwhelming the system.
b) Master-Slave Replication
Definition: Master-slave replication involves creating a primary node (master) that handles all writes, while one or more
secondary nodes (slaves) replicate the master’s data and can serve read requests.
Key Points:
• Scaling Reads: This model is particularly effective for read-intensive datasets, as multiple slaves can handle read requests while all writes go through the master.
c) Peer-to-Peer Replication
Definition: In peer-to-peer replication, all nodes in the system are equal, allowing each node to accept reads and writes,
thus eliminating the single point of failure associated with master-slave setups.
Key Points:
• Fault Tolerance: This model increases resilience since the failure of any single node does not prevent access
to data, improving overall system availability.
• Write Scalability: Since every node can accept writes, the system can scale horizontally to handle increased
write loads effectively.
• Consistency Challenges: A major challenge is maintaining consistency, as simultaneous writes to different
nodes can lead to conflicts (write-write conflicts).
• Conflict Resolution: Approaches to handle inconsistencies include:
o Coordinating writes to ensure no conflicts arise, which may require additional network traffic.
o Allowing inconsistent writes and later merging them based on application-specific rules or policies.
• Trade-offs: There’s a spectrum of options between strict consistency and high availability, where systems must
balance the two based on application needs.
6. What are Distribution models? Briefly explain two paths of data distribution
Distribution models refer to the ways data is spread across multiple servers or nodes in a database system. They are
crucial for handling large amounts of data and increased user traffic. By distributing data, systems can improve
performance and availability, but it can also add complexity.
1. Replication:
o What it is: This involves making copies of the same data on multiple nodes.
o Types:
▪ Master-Slave Replication: One main server (master) handles all the writing, while other
servers (slaves) replicate this data and handle reading. This helps with read traffic but is limited
by the master’s write capacity.
▪ Peer-to-Peer Replication: All servers have equal status and can handle both reads and writes.
This improves fault tolerance and scalability but can complicate data consistency.
2. Sharding:
o What it is: This involves splitting the dataset into smaller pieces (shards), each stored on different
nodes.
o How it works: Each shard contains a portion of the overall data, allowing servers to handle specific
requests. This improves performance by reducing load and contention since each server deals with only
part of the data.
Update Consistency and Read Consistency are key concepts in database management, focusing on how data is
modified and accessed when multiple users interact with it.
Update Consistency
Update Consistency ensures that when multiple users try to change the same data, the final outcome is correct.
Example:
• Scenario: Martin and Pramod both want to update the same phone number.
o Martin enters "123-456-7890," and Pramod enters "(123) 456-7890" at the same time.
Conflict:
• If Martin's update is processed first, Pramod's update will overwrite it, causing Martin's change to be lost. This
is called a lost update.
Resolution Approaches:
• Pessimistic: Use write locks to ensure only one user can update at a time, preventing conflicts.
• Optimistic: Allow both updates but check for changes before saving. If Pramod tries to save after Martin, his update is rejected because the data has changed since he read it, and he must redo it against the current value.
Read Consistency
Read Consistency ensures users see accurate data, especially during concurrent updates.
Example:
• Scenario: Martin adds a line item to his order, affecting the shipping charge. Meanwhile, Pramod reads the
order and charge.
o If Pramod reads while Martin’s update is happening, he might see outdated information, leading to a
read-write conflict.
Prevention:
• Transactions: If Martin uses a transaction for his updates, Pramod will either see all changes before or after,
avoiding inconsistencies.
Replication Issues:
• In distributed systems, different nodes might show outdated data. For example, if a hotel room gets booked, one
user might see it available while another sees it booked, leading to eventual consistency, where updates
eventually sync across all nodes.
Session Consistency:
• This ensures that once a user updates data, any further reads in the same session show the most recent updates.
MODULE 3
1. Explain with a neat diagram Partitioning and Combining in Map reduce 10M
Partitioning and Combining in MapReduce
In MapReduce, the partitioning and combining phases are essential for optimizing performance,
especially when dealing with large datasets and multiple nodes. Here's an explanation along with a
diagram of how they work:
Partitioning in MapReduce
1. Partitioning refers to the division of the output of the map tasks into distinct groups, based on
the key.
2. After the mappers process the data, the results are divided into partitions (or "buckets"), where
each partition contains results for a specific key or a set of keys.
3. These partitions are then sent to separate reducers, allowing parallel processing of the data. This
is beneficial because it means multiple reducers can process data simultaneously, increasing the
overall performance and scalability of the system.
4. The partitioning phase ensures that each reducer handles a specific subset of the data, making
the reduce function operate only on a relevant set of key-value pairs.
Combining in MapReduce
1. Combining is a technique used to reduce the amount of data that needs to be transferred
between the mappers and reducers. Often, the same key will appear multiple times in the output
of a map task, leading to redundant data transfer.
2. A combiner function is applied to reduce this redundancy by merging values for the same key
before the data is sent to the reducer.
3. The combiner function is essentially a mini-reducer that reduces the data on the mapper's side.
This reduces the amount of data being transferred across the network and speeds up the overall
process.
4. Not all reducers are combinable. A combinable reducer must produce the same kind of output
as its input. If the output structure differs (e.g., counting unique items), combining may not be
possible.
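A rough sketch of a combinable reducer for the sales example (function shapes assumed, not tied to any particular framework): because its output has the same shape as its input, it can safely run on the mapper side as a combiner.
// Map output for one product key: a list of { quantity, revenue } pairs.
function combine(pairs) {
  // Merges pairs for the same key; output has the same shape as the input,
  // which is what makes this reducer usable as a combiner.
  return [pairs.reduce(
    (acc, p) => ({ quantity: acc.quantity + p.quantity, revenue: acc.revenue + p.revenue }),
    { quantity: 0, revenue: 0 }
  )];
}

console.log(combine([{ quantity: 5, revenue: 500 }, { quantity: 3, revenue: 360 }]));
// [ { quantity: 8, revenue: 860 } ]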
2. Explain Basic Map-Reduce with a neat diagram 7M
Basic Map-Reduce
Map-Reduce is a computational model used to process large amounts of data in parallel across a
distributed cluster. It divides the computation into two main steps: the Map step and the Reduce step.
Below, we explain the basic concept of Map-Reduce with an example and a diagram.
Problem Scenario:
Let’s consider an example of a sales report system where we have orders, each containing line items.
Each line item has the following attributes:
• Product ID
• Quantity
• Price Charged
We want to calculate the total revenue for each product over the last seven days. This requires
analyzing all the orders, but since the data is sharded across multiple machines, we need to use Map-
Reduce to perform the analysis in a distributed manner.
Map Function:
• Input: The map function takes an order (or a record) as input.
• Output: The map function emits a series of key-value pairs. In this case, the key is
the Product ID, and the value is an embedded structure containing the quantity and price of
the product.
For example, if an order contains two line items for a product, the map function will emit:
• Key: Product ID (e.g., "P123")
• Value: (Quantity, Price)
These key-value pairs are emitted for each line item in every order.
Reduce Function:
• Input: The reduce function takes all key-value pairs with the same key (product ID) that were
emitted by the map function. It aggregates all values associated with that key.
• Output: The reduce function combines the quantities and prices for each product to calculate
the total quantity sold and the total revenue for that product.
For example, if the map function emits multiple values for the product ID "P123" (e.g., (5, 100), (3,
120)), the reduce function will sum the quantities and prices to get the total revenue for that product.
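A minimal sketch of those two functions in JavaScript (the record layout is assumed purely for illustration):
// Map: emit (productId -> { quantity, revenue }) for every line item of an order.
function map(order) {
  return order.lineItems.map(li => [li.productId, { quantity: li.quantity, revenue: li.quantity * li.price }]);
}

// Reduce: sum all values emitted for one product ID.
function reduce(productId, values) {
  return values.reduce(
    (acc, v) => ({ quantity: acc.quantity + v.quantity, revenue: acc.revenue + v.revenue }),
    { quantity: 0, revenue: 0 }
  );
}

const order = { lineItems: [{ productId: "P123", quantity: 5, price: 100 }, { productId: "P123", quantity: 3, price: 120 }] };
console.log(reduce("P123", map(order).map(([, v]) => v)));
// { quantity: 8, revenue: 860 }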
3. Explain two stage Map-Reduce example with a neat diagram 10M
Two-Stage Map-Reduce Example
As the complexity of calculations increases, it's often helpful to break down a Map-Reduce job
into multiple stages. This approach allows each stage to focus on specific tasks, passing the output of
one stage as input to the next, similar to the pipes-and-filters model in UNIX. Below is an explanation
and diagram of a two-stage Map-Reduce example.
Example Scenario:
We want to compare the sales of products for each month in 2011 with the sales for the same months in
2010. This requires calculating the monthly sales for each product in both 2010 and 2011 and then
comparing them.
Stage 1: Aggregating Monthly Sales Data by Product:
The first stage involves processing the raw order records to aggregate product sales for each month in
the year. Here’s how it works:
• Input: Raw order records, which contain product details like product ID, quantity, price, and
date of the order.
• Output: Key-value pairs where:
• Key: Composite key combining product ID and month (e.g., ("P123", "Jan 2011"))
• Value: The quantity and total revenue for that product in that month.
For example:
• Input Order: Product "P123" sold 5 units at $100 in January 2011.
• Map output: Key ("P123", "Jan 2011"), Value (Quantity: 5, Revenue: 500).
The map function is applied to every order, producing a key-value pair for each product in each month.
Stage 2: Comparing Sales for the Same Month in Different Years:
In the second stage, we take the output of the first stage (aggregated product sales by month) and
perform a comparison between the same months in 2011 and 2010.
• Input: The output from Stage 1, which contains product sales for each month in 2010 and 2011.
• Processing: The second-stage mapper groups records by year and month. It then populates
values for the current year (2011) and the previous year (2010) for each product.
• For records from 2011, the quantity and revenue are associated with the current year.
• For records from 2010, the quantity and revenue are associated with the prior year.
• Output: Key-value pairs for each product, where the key is the product ID, and the value is a
tuple containing the sales data for 2011 and 2010.
Final Reduce Step:
• Input: Key-value pairs from Stage 2, grouped by product ID. Each product’s value will contain
two records (one for 2011 and one for 2010).
• Processing: The reduce function takes the values for the same product from both 2010 and
2011 and computes the difference, such as the percentage change in sales between the two
years.
• Output: Final aggregated result, which shows the difference in sales for each product between
2010 and 2011.
In the Map-Reduce model, calculations are composed by structuring them around two main
tasks: Map and Reduce. Each stage of the calculation is carefully designed to work within the
constraints of the model, where:
• The Map function operates on a single aggregate (e.g., an order), producing key-value pairs.
• The Reduce function operates on all key-value pairs associated with a particular key (e.g., all
the values for a given product).
These calculations are structured to efficiently break down tasks into independent units that can be
processed in parallel. However, certain calculations, such as averages, require specific strategies for
composition because not all operations are composable in a simple manner.
Example: Calculating the Average Ordered Quantity
1. Map Step
In the map step, the function emits key-value pairs based on the input data (e.g., orders). If we're
calculating the average ordered quantity for products, each map output would be:
• Key: Product ID
• Value: The ordered quantity and a count of 1
This allows the data to be distributed across different nodes for parallel processing, where each node
handles different products independently.
2. Intermediate Output
After the map step, the output consists of key-value pairs where the key is the product ID, and the value
is a tuple containing the total quantity of orders for the product and a count of how many orders
contributed to that total.
3. Reduce Step
In the reduce step, all the values for a particular product are aggregated. The reducer:
• Sums up the total quantities and counts for that product across all mapped outputs.
• Calculates the average by dividing the total quantity by the total count.
This step can also be seen as a merge of the partial results (sum and count) and the final calculation
(average).
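A small sketch of why averages need the sum-and-count trick (shapes invented for illustration): partial averages cannot be merged, but partial sums and counts can.
// Map emits { sum, count } per product instead of a partial average.
function mapQuantity(lineItem) {
  return [lineItem.productId, { sum: lineItem.quantity, count: 1 }];
}

// Reduce (and any combiner) merges the partials; the average is computed last.
function reduceQuantity(values) {
  const total = values.reduce((a, v) => ({ sum: a.sum + v.sum, count: a.count + v.count }), { sum: 0, count: 0 });
  return total.sum / total.count;
}

console.log(reduceQuantity([{ sum: 5, count: 1 }, { sum: 3, count: 1 }, { sum: 10, count: 1 }])); // 6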
5. What are Key-Value stores? List out some popular key value database. Explain how all the data
is stored in a single bucket of a key value data store 5M/8M
Key-Value Store Explanation
In key-value stores like Riak, data is stored in buckets. A bucket is a logical grouping or container for
a set of keys and their associated values. Each key-value pair is stored in a flat namespace, meaning
each key is unique within a bucket, and its associated value can be any data type or object.
Popular Key-Value Databases:
• Riak
• Redis
• Memcached DB
• Berkeley DB
• HamsterDB
• Amazon DynamoDB
• Project Voldemort
Data Storage in a Single Bucket
In the figure, we see a bucket named userData, which contains multiple types of data.
The key is represented by the sessionID, and the value is an object that contains multiple aggregates:
1. UserProfile
2. SessionData
3. ShoppingCart, which further contains multiple CartItems.
This scenario demonstrates how a single key in a key-value store can hold multiple objects (or
aggregates) within its value. These objects are stored as part of the same key-value pair under
the sessionID key in the userData bucket. This can be convenient for storing all session-related
information in a single place but can lead to potential issues like key conflicts or difficulties in
accessing specific parts of the data if the bucket grows too large.
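A loose sketch of that layout (key and field names are invented for illustration):
// Everything for one session stored as a single value under one key.
const userData = new Map();
userData.set("sessionId:288790b8a421", {
  userProfile: { name: "Martin", language: "en" },
  sessionData: { loggedInAt: "2024-01-01T10:00:00Z" },
  shoppingCart: { cartItems: [{ productId: "P123", qty: 1 }] }
});
// One get() returns the whole aggregate; reading just the cart still
// requires fetching and parsing the entire value.
const session = userData.get("sessionId:288790b8a421");
console.log(session.shoppingCart.cartItems.length); // 1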
Managing Data in a Single Bucket
One way to manage this is to store each type of object in its own bucket (for example, separate buckets for user profiles, session data, and shopping carts), or to append the object name to the key (e.g., sessionID_userProfile), so that individual objects can be read or written without fetching the whole aggregate value.
6. List and explain any 2 features of Key-value store 5M
ANY TWO AMONG THESE
1. Consistency
• Definition: Consistency refers to how data is synchronized across all nodes in a distributed
system.
• Understanding: In a key-value store like Riak, consistency is often eventually achieved,
meaning that updates to data may not immediately reflect across all nodes but will eventually
sync. Riak provides options to handle write conflicts, like the "last write wins" or "siblings"
model, where multiple conflicting versions can be returned for client resolution.
• Trade-off: A strong consistency model can affect write performance. You can tune consistency
settings like W (number of nodes for successful write) and R (number of nodes for successful
read) for balancing between consistency and availability.
2. Transactions
• Definition: Transactions refer to ensuring that multiple operations are treated as a single unit,
either all succeeding or all failing.
• Understanding: Key-value stores generally do not provide full ACID transactions (like
relational databases), but some do offer simplified transactional features. In Riak, this is
implemented using quorum-based writes where a write is considered successful only if it’s
confirmed by a quorum (a specified number of nodes). This allows for write tolerance even if
some nodes are unavailable.
• Trade-off: While this improves availability, it can sacrifice strong consistency because not all
nodes may have the latest data at any given time.
3. Query Features
• Definition: Query features define how data can be retrieved or queried within the database.
• Understanding: Traditional relational databases allow complex queries across multiple fields,
but key-value stores are much simpler. They support querying only by key, not by other
attributes inside the value. Some key-value stores, like Riak, provide an indexing feature
(like Riak Search) to allow querying inside values. However, in most key-value stores, you
can only fetch data by the key, and ad-hoc queries are not as flexible.
• Trade-off: This makes key-value stores suitable for use cases where data is frequently accessed
via a known key, such as session data, shopping carts, or user profiles. The simplicity of key-
based queries also leads to faster retrieval times for those keys.
4. Structure of Data
• Definition: This refers to how the data is stored inside the key-value store, particularly the value
part of the key-value pair.
• Understanding: In key-value stores, the value can be any data structure, from a simple blob of
text to more complex formats like JSON or XML. The key itself is used for quick access, and
the value can be anything the application needs. For example, in Riak, the data is not
constrained to a specific format, and you can store structured data like user profiles or session
data.
• Trade-off: The lack of structure in the data format allows flexibility, but it also means the
application must understand how to interpret the value, unlike structured databases that impose
a schema.
5. Scaling
• Definition: Scaling refers to the ability of the database to handle increased loads by adding
more resources, such as nodes or servers.
• Understanding: Key-value stores often scale horizontally, meaning they distribute data across
multiple nodes. This is achieved through sharding, where each key is assigned to a specific
node based on a hashing function. As the data grows, more nodes can be added to the cluster to
maintain performance and handle higher traffic.
• Trade-off: Sharding improves performance but introduces the challenge of data availability
during node failures. Systems like Riak manage this through replication, where each piece of
data is stored on multiple nodes to ensure availability even when some nodes fail. However,
managing consistency during failures and ensuring that the right data is always accessible can
be complex.
6. Availability and Fault Tolerance
• Definition: This refers to the ability of the database to remain operational even when parts of
the system fail.
• Understanding: Key-value stores like Riak are designed to be highly available and fault-
tolerant. By replicating data across multiple nodes, the system ensures that even if one or more
nodes fail, the data remains accessible from other nodes.
• Trade-off: This replication can affect consistency (i.e., how up-to-date the data is across all
nodes), and trade-offs between availability and consistency can be adjusted based on needs (as
per the CAP Theorem).
7. Data Expiry
• Definition: Data expiry refers to the ability to automatically remove or expire data after a
certain period.
• Understanding: Some key-value stores allow setting an expiration time for a key-value pair,
such as the expiry_secs feature in Riak. This is useful for temporary data like session data or
shopping cart information.
• Trade-off: Data expiration can help keep the database clean and ensure that temporary data is
not stored longer than necessary. However, this feature may not be available in all key-value stores.
7. Elaborate the suitable use cases of Key-value store. When Key-value stores are not suitable.
Explain
When to Use Key-Value Stores
1. Storing Session Information
• Use Case: Web applications store session data (e.g., user ID, preferences) using a
unique session ID.
• Why It Works: Key-value stores allow fast retrieval of all session data with one
request. For example, Memcached or Riak can store and retrieve session data quickly.
2. User Profiles and Preferences
• Use Case: Store user settings, such as language, timezone, and preferences.
• Why It Works: Each user can have a unique key (e.g., user ID), and all their
preferences can be stored and retrieved easily in one operation.
3. Shopping Cart Data
• Use Case: E-commerce sites can store shopping cart contents tied to a user.
• Why It Works: The user's shopping cart data is stored under their user ID and can be
accessed across devices, making it fast and persistent.
When Key-Value Stores Are Not Suitable
1. Relationships Among Data
• Issue: If you need to relate data held under different keys, or correlate data between different sets of keys, key-value stores offer no direct support.
• Why Not: Any relationships must be maintained and resolved entirely in application code.
2. Multi-Operation Transactions
• Issue: If you need to perform multiple operations on different keys at once and need to ensure all succeed or
ensure all succeed or fail together, key-value stores aren't suitable.
• Why Not: Key-value stores typically can't handle multi-step transactions or rollbacks.
3. Querying by Data
• Issue: If you need to search for data based on the value (e.g., finding all users with a
certain preference), key-value stores don't support this.
• Why Not: They only allow querying by the key, not the data inside the value.
4. Operations on Multiple Keys
• Issue: If you need to perform actions across multiple keys (e.g., batch updates), key-
value stores can't do this directly.
• Why Not: Key-value stores operate on one key at a time, so any bulk operation must
be handled in the application itself.
Module 4
1. What are Document databases? Explain with an example. List and explain any 2
features of document databases 10M
A document database is a type of NoSQL database that stores, retrieves, and manages data in
the form of documents, typically in JSON, BSON, or XML formats. Each document is a self-
contained unit of data that can include fields, arrays, and nested sub-documents.
Unlike traditional relational databases (RDBMS), document databases allow for schema
flexibility. Documents within the same collection can have varying structures, eliminating the
need for a predefined schema. This makes document databases ideal for handling unstructured
and semi-structured data.
Example:
Document 1:
{
"firstname": "Martin",
"likes": ["Biking", "Photography"],
"lastcity": "Boston"
}
Document 2:
{
"firstname": "Pramod",
"citiesvisited": ["Chicago", "London", "Pune", "Bangalore"],
"addresses": [
{ "state": "AK", "city": "DILLINGHAM", "type": "R" },
{ "state": "MH", "city": "PUNE", "type": "R" }
],
"lastcity": "Chicago"
}
• Documents with differing fields (e.g., likes in Document 1 and addresses in Document
2) coexist in the same collection.
• The schema is flexible, and unused fields are simply omitted.
9.2 Features
Document databases offer a variety of powerful features. MongoDB, one of the most popular
document databases, serves as a representative example.
9.2.1 Consistency
Consistency in document databases such as MongoDB is tuned through replica sets and write concerns, which control how many nodes must acknowledge a write before it is considered successful.
Example:
• w: "majority" ensures that the write propagates to a majority of nodes in the replica
set before success is confirmed.
• Read operations can be executed from slave nodes by enabling slaveOk.
9.2.2 Transactions
Transactions are atomic at the level of a single document; finer control over write safety is provided through WriteConcern.
Example:
shopping.setWriteConcern(WriteConcern.REPLICAS_SAFE);
shopping.insert(order, WriteConcern.REPLICAS_SAFE);
• REPLICAS_SAFE ensures the data is written to the primary and at least one secondary
node.
9.2.3 Availability
Key Points:
• Applications do not need to detect or manage node failures; the MongoDB driver
handles communication with the new primary.
• Replication provides:
o Data redundancy
o Failover
o Read scaling
9.2.4 Query Features
Examples:
• In SQL, retrieving an order together with its customer and line items typically requires joins across several tables.
• MongoDB simplifies queries for nested fields, since documents are aggregate objects and nested attributes can be queried directly.
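For instance, a query on a nested field might look like this (collection and field names are assumed for illustration):
// Find orders that contain a line item for product "P123".
db.orders.find({ "lineItems.productId": "P123" });
// The equivalent SQL would join the orders and order_items tables.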
9.2.5 Scaling
Document databases scale horizontally to handle increasing data and load:
1. Read Scaling:
o Add more read slaves to a replica set.
o Distribute read queries across secondary nodes using slaveOk.
Example:
rs.add("newNode:27017");
2. Write Scaling:
o Data can be sharded (partitioned) across multiple nodes based on a shard key.
o Shards are dynamically rebalanced when new nodes are added.
Example of Sharding:
• Sharding enables distribution of data and write operations across multiple servers.
• Each shard can also be a replica set, combining scaling with high availability.
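A minimal sketch of enabling sharding in the mongo shell (database, collection, and shard key are assumed for illustration):
// Enable sharding for a database, then shard a collection on a chosen key.
sh.enableSharding("ecommerce");
sh.shardCollection("ecommerce.orders", { customerId: 1 });
// MongoDB then distributes chunks of the orders collection across shards
// and rebalances them automatically as shards are added.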
i. Consistency
Consistency in MongoDB is configured using replica sets and controlling the behavior of
write operations. Every write operation can specify how many servers must acknowledge the
write before it is considered successful, ensuring different levels of consistency.
• Replica Sets:
A replica set is a group of MongoDB nodes where data is replicated asynchronously.
Writes are sent to the primary node and then propagated to secondary nodes.
• Write Concern:
Write consistency is achieved by setting the w parameter to specify how many nodes
must acknowledge the write.
Example:
db.runCommand({ getlasterror: 1, w: "majority" });
o For three nodes, the write must be acknowledged on at least two nodes for
success.
• Read Consistency:
To improve read performance, reads can be performed from secondary nodes using
the slaveOk parameter, though this may result in slightly stale data.
Example:
DBCollection collection = getOrderCollection();
BasicDBObject query = new BasicDBObject();
query.put("name", "Martin");
DBCursor cursor = collection.find(query).slaveOk();
ii. Transactions
In traditional RDBMS, transactions involve multiple operations (insert, update, delete) that
are committed or rolled back together. Document databases, like MongoDB, primarily
support atomic transactions at the single-document level.
• Single-Document Transactions:
Writes in MongoDB are atomic at the document level. A single write operation
involving embedded documents is treated as a single unit.
• WriteConcern for Finer Control:
By using WriteConcern, MongoDB allows you to control the safety level of
writes. For example, you can ensure that a write is replicated to multiple nodes before
being acknowledged.
Example:
shopping.setWriteConcern(WriteConcern.REPLICAS_SAFE);
shopping.insert(order, WriteConcern.REPLICAS_SAFE);
Key Trade-off:
MongoDB focuses on performance and scalability over traditional multi-document
transaction guarantees.
2. Briefly explain Scaling feature of document database, with a neat diagram 10M
Scaling in a document database allows handling more load without simply migrating to a
larger server. Instead, the focus is on features within the database to horizontally scale reads
and writes.
1. Horizontal Scaling for Reads
• Scaling for heavy-read loads is achieved by adding more read slaves to a replica set.
• Reads are directed to the slave nodes to distribute the load.
• Adding new nodes can increase the read capacity without downtime.
• Example: In a 3-node replica set, adding a new slave node like mongo D improves
read performance.
rs.add("mongod:27017")
2. Horizontal Scaling for Writes
• Scaling for writes is achieved using sharding, which splits data across multiple
shards based on a shard key.
• Each shard can also be a replica set for fault tolerance and improved read
performance.
• The shard key ensures the data is evenly distributed for optimal performance.
• New shards can be added to the cluster, and MongoDB will automatically rebalance
the data across shards.
Figure 9.3. MongoDB sharded setup where each shard is a replica set
Summary: Reads scale by adding slaves to a replica set, while writes scale by sharding data across multiple shards, each of which can itself be a replica set.
3. Elaborate the suitable use cases of document database. When document databases are
not suitable. Explain
Document databases are highly flexible and can store semi-structured data without predefined
schemas, making them ideal for several scenarios where data requirements evolve over time.
Below are some suitable use cases:
9.3.1 Event Logging
• Use Case: Many enterprise applications require event logging for various types of
activities such as transactions, user actions, or system events.
• Why Suitable: Document databases can store different types of events in a central
store without enforcing a strict schema.
• Sharding: Events can be efficiently sharded based on the application's name (e.g.,
app1, app2) or event type (e.g., order_processed, customer_logged).
• Benefit: Document databases easily accommodate the changing structure of event
data over time.
9.3.2 Content Management Systems, Blogging Platforms
• Use Case: Websites and platforms for publishing articles, managing user comments,
user profiles, and other web-facing content.
• Why Suitable: Document databases can store JSON-like documents, which are ideal
for representing dynamic and hierarchical data such as:
o Blog articles
o User-generated content
o User registrations and profiles.
• Benefit: No predefined schema makes it easier to accommodate frequent content
changes.
9.3.3. Web Analytics or Real-Time Analytics
• Use Case: Storing and processing real-time analytics data like page views, unique
visitors, or user interactions.
• Why Suitable:
o Parts of documents can be updated easily.
o New metrics or fields can be added without modifying a strict schema.
• Benefit: Suitable for real-time, schema-flexible analytics where metrics evolve over
time.
9.3.4 E-Commerce Applications
• Use Case: Managing product catalogs, orders, and customer data in e-commerce
platforms.
• Why Suitable:
o E-commerce applications often require flexible schemas for dynamic product
details and evolving order structures.
o Document databases allow changes in data models without expensive data
migrations.
• Benefit: Scalability and flexibility make document databases ideal for handling large
product catalogs and customer data.
4. When Document Databases Are Not Suitable
Despite their advantages, there are certain scenarios where document databases may not be
the best solution:
Complex Transactions Spanning Different Operations
• Issue: Document databases do not inherently support atomic cross-document
operations (transactions involving multiple documents).
• Example: Applications requiring ACID transactions across different records or
collections.
• Exception: Some document databases (e.g., RavenDB) provide partial support for
complex transactions.
• Why Unsuitable: The lack of atomic operations limits use in applications requiring
high transaction consistency.
Document databases like MongoDB use JSON-like documents to store data and support
flexible queries. Below are some common example queries that demonstrate typical
operations such as CRUD (Create, Read, Update, and Delete), filtering, and aggregation.
4.1. Insert Documents (Create Operation)
db.customers.insertOne({
firstname: "John",
lastname: "Doe",
email: "[email protected]",
orders: [
{ item: "laptop", price: 1200 },
{ item: "mouse", price: 25 }
],
location: "USA"
});
4.2. Query Documents (Read Operation)
To find data in a document database, you use the find() function with optional filters.
db.customers.find();
db.customers.find({ location: "USA" });
This query retrieves all documents where the location field is "USA".
db.customers.find({
$and: [
{ location: "USA" },
{ "orders.price": { $gt: 100 } }
]
});
This finds all customers in the USA who have at least one order with a price greater than 100.
4.3. Update Documents (Update Operation)
db.customers.updateMany(
{ location: "USA" }, // Filter
{ $set: { status: "Active" } } // Add a new field 'status' with value
'Active'
);
This adds or updates the status field for all customers located in the USA.
4.4. Delete Documents (Delete Operation)
db.customers.deleteOne({ firstname: "Alice" });
Example 9: Delete multiple documents
db.customers.deleteMany({ location: "Canada" });
4.5. Aggregation
Document databases allow data aggregation to perform operations like grouping, sorting, and
counting.
db.customers.aggregate([
{ $group: { _id: "$location", count: { $sum: 1 } } }
]);
This groups customers by their location field and counts the number of customers in each
group.
db.customers.aggregate([
{ $unwind: "$orders" }, // Unwind the orders array
{ $group: { _id: "$firstname", totalSales: { $sum: "$orders.price" } } }
]);
This calculates the total sales amount for each customer based on their orders.
4.6. Sorting and Projection
db.customers.find().sort({ lastname: 1 });
Example 13: Retrieve only firstname and email fields
db.customers.find({}, { firstname: 1, email: 1, _id: 0 });
This retrieves only the firstname and email fields, excluding the _id field.
Conclusion
These example queries demonstrate the core operations supported in a document database:
1. Insert: Adding new documents.
2. Query: Retrieving documents with conditions.
3. Update: Modifying documents.
4. Delete: Removing documents.
Module 5
1. What are Graph databases? Explain with an example 10M
Graph databases are specialized databases designed to store, manage, and query data
represented as graphs. A graph consists of two primary components:
1. Nodes: Represent entities or objects (like people, products, or locations). Each node
can have properties to describe it.
2. Edges (Relationships): Represent connections or relationships between nodes. These
edges can also have properties and directionality.
In essence, a graph database is ideal for handling interconnected data and complex
relationships. Traversing relationships in graph databases is highly efficient because
relationships are persisted rather than calculated at query time.
Consider the example graph structure shown in Figure 11.1, where various entities and their relationships are represented:
Figure 11.1. An example graph structure
• Nodes (Entities):
o People: Anna, Barbara, Carol, Dawn, Elizabeth, Jill, Martin, and Pramod.
o Books: NoSQL Distilled, Refactoring, Databases, Database Refactoring.
o Company: BigCo.
• Edges (Relationships):
o Friend Relationships: For example, Barbara is a friend of Carol.
o Likes: Barbara likes NoSQL Distilled, and Elizabeth likes Databases.
o Employee: Anna and Carol are employees of BigCo.
o Author: Martin authored Refactoring, and Pramod authored Database
Refactoring and NoSQL Distilled.
o Category: The book Database Refactoring belongs to the category Databases.
1. Nodes store data entities (like objects in OOP). Each node has properties (e.g., name: Martin).
2. Edges represent relationships between nodes, such as likes, friend, or author.
Edges can have directionality and properties (e.g., "since").
3. Traversals: A query on a graph is referred to as a traversal. Traversals efficiently
navigate through nodes and relationships to answer complex queries.
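To make the idea of a traversal concrete, below is a minimal sketch using the Neo4j JavaScript driver (neo4j-driver) against the example graph; the connection URL, credentials, and the Person/Book labels are assumptions made for illustration and are not part of the original notes.
// Traversal sketch: start at Barbara, follow FRIEND edges to her friends,
// then follow LIKES edges to the books those friends like.
const neo4j = require('neo4j-driver');

// Hypothetical connection details for a local Neo4j instance.
const driver = neo4j.driver('bolt://localhost:7687', neo4j.auth.basic('neo4j', 'password'));

async function friendsLikes(name) {
  const session = driver.session();
  try {
    // Relationships are stored explicitly, so the traversal simply walks edges;
    // no joins are computed at query time.
    const result = await session.run(
      'MATCH (p:Person {name: $name})-[:FRIEND]->(f:Person)-[:LIKES]->(b:Book) ' +
      'RETURN f.name AS friend, b.title AS book',
      { name }
    );
    return result.records.map(r => ({ friend: r.get('friend'), book: r.get('book') }));
  } finally {
    await session.close();
  }
}

friendsLikes('Barbara').then(console.log).finally(() => driver.close());
Because the FRIEND and LIKES relationships are persisted as edges, the query only walks existing connections instead of computing joins at query time.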
2. Briefly describe relationships in graph databases, with a neat diagram 10M
In graph databases, relationships are fundamental components that connect nodes (entities)
to form meaningful, traversable graphs. They provide the key advantage of enabling queries
that rely on connections between data points, allowing powerful, flexible models that
closely reflect real-world scenarios.
Key Characteristics of Relationships
1. Directionality:
Relationships are directional, meaning they have:
o Start Node (where the relationship begins).
o End Node (where the relationship points to).
Example: A user can "like" a product, but the product does not "like" the user.
2. Traversability:
o Relationships can be traversed in both directions (e.g., incoming and
outgoing paths).
o This makes it easy to navigate through connected nodes.
3. Type:
o Relationships have a type that defines their meaning (e.g., FRIEND, LIKES,
EMPLOYEE_OF).
4. Properties:
o Relationships can carry their own properties (e.g., since on a FRIEND relationship, or role and hired_date on EMPLOYEE_OF).
5. Flexibility:
o New relationship types can be easily added to the graph without restructuring the entire database.
6. Queries:
o Queries can filter relationships based on properties or types, enabling rich
domain models. For example, we can query:
▪ "Find friends who became friends after 2010."
▪ "Find employees working in the Research role."
Below is an example graph structure representing relationships between nodes, similar to the figure above:
Nodes: Anna, Barbara, Carol, and Elizabeth (people) and BigCo (company).
Relationships:
1. Employee_of:
o Directional relationship from "Anna," "Barbara," and "Carol" to "BigCo."
o Contains properties such as role (e.g., "Manager," "Research") and hired_date
(e.g., "Mar 06," "Feb 04").
2. Friend:
o Bidirectional friendships exist between nodes like "Anna ↔ Barbara" and "Carol ↔ Barbara."
o Contains properties like since to denote the starting year of the friendship.
3. Share:
o Between "Barbara" and "Elizabeth," showing shared interests like books,
movies, tweets.
3. Explain scaling and application-level sharding of nodes with a neat diagram 10M
Scaling and Application-Level Sharding of Nodes in Graph Databases
Challenges in Scaling Graph Databases
1. Sharding Complexity:
o Sharding (splitting data across servers) is difficult because any node can have
relationships with any other node.
o Traversing relationships across servers can cause latency and reduce performance.
2. Memory Management:
o Graph databases benefit from storing relationships and nodes in memory for
faster traversal.
o Modern servers with large RAM allow the entire dataset to fit in memory, ensuring better performance.
Scaling Techniques
1. Vertical Scaling:
o Add more RAM to the server so that nodes and relationships fit entirely in
memory.
o Effective when the dataset is small enough to fit in a single server's memory.
2. Read Scaling Using Replicas:
o Master-Slave Replication:
▪ All writes are directed to the master node.
▪ Reads are distributed across multiple slave nodes (read-only replicas).
▪ Proven in systems like MySQL to improve read performance and
availability.
3. Sharding for Large Datasets:
o When datasets are too large for replication, application-level sharding is
used.
o Nodes are split across multiple servers based on domain-specific knowledge
(e.g., geographical location).
o Example: North America nodes are on one server, and Asia nodes are on
another.
Application-Level Sharding
Application-level sharding involves splitting nodes across servers based on specific criteria defined at the application level. The application manages the logic for querying data across these shards.
Example:
1. Nodes related to North America are stored on Server A.
2. Nodes related to Asia are stored on Server B.
3. Relationships between nodes are kept localized to minimize cross-server traversal (see the routing sketch below).
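A minimal sketch of this routing logic in application code is shown below; the shard URIs and the routeToShard helper are hypothetical names that only illustrate keeping the shard choice at the application level.
// Application-level sharding sketch: the application decides which server
// holds a node based on a domain-specific key (here, geographic region).
const shards = {
  northAmerica: 'bolt://server-a:7687',   // hypothetical Server A
  asia:         'bolt://server-b:7687'    // hypothetical Server B
};

// Map a node's city to the shard that stores it.
function routeToShard(city) {
  if (['LA', 'Chicago', 'NY'].includes(city)) return shards.northAmerica;
  if (['Mumbai', 'Xian', 'Singapore', 'Jakarta'].includes(city)) return shards.asia;
  throw new Error('Unknown city: ' + city);
}

// Queries about Chicago go to Server A, queries about Mumbai go to Server B.
console.log(routeToShard('Chicago'));  // bolt://server-a:7687
console.log(routeToShard('Mumbai'));   // bolt://server-b:7687
The database itself is unaware of this split; every query first passes through the routing step so that traversals stay within a single shard whenever possible.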
Diagram Explanation
The diagram below illustrates application-level sharding using two geographical regions:
1. North America (Server A)
2. Asia (Server B)
Nodes:
• North America: LA, Chicago, NY.
• Asia: Mumbai, Xian, Singapore, Jakarta.
Relationships:
• Supplier, Reseller, Distributor, and Warehouse relationships exist within each
shard.
• Cross-region relationships are minimized to optimize performance.
4. Explain some suitable use cases of graph databases and describe when we should not
use graph databases
Suitable Use Cases of Graph Databases and When Not to Use Them
Suitable Use Cases of Graph Databases
Graph databases are highly effective for scenarios where the relationships between data
points are critical. Here are the most suitable use cases:
C
1. Connected Data
Graph databases are ideal when the data is highly interconnected, and relationships are just
as important as the data itself.
• Social Networks:
o Example: Users (nodes) and their relationships (edges), such as friends,
followers, and likes.
o Use Case: Analyze how people are connected or identify mutual friends (e.g.,
Facebook, LinkedIn).
• Organizational Networks:
o Example: Employees, teams, and departments as nodes, connected by reporting and collaboration relationships.
o Use Case: Model reporting lines or find who works with whom across an organization.
2. Routing, Dispatch, and Location-Based Services
Graph databases can optimize routing problems where locations and distances are key factors.
• Delivery Routing:
o Nodes represent delivery points (addresses), and relationships include properties like distance or delivery time.
o Use Case: Determine the shortest route for deliveries.
• Location-Based Recommendations:
o Example: Restaurants or stores as nodes.
o Use Case: Recommend nearby services (e.g., “Best restaurant within 2
kilometers”).
3. Recommendation Engines
Graph databases can analyze patterns in relationships to provide personalized
recommendations.
• E-Commerce Recommendations:
o Example: “People who bought this item also bought that item.”
o Use Case: Suggest products based on user behavior (a query sketch follows this list).
• Travel Recommendations:
o Example: “People visiting Barcelona often visit Gaudi's landmarks.”
• Fraud Detection:
o Relationships between transactions can be analyzed to find suspicious
patterns.
o Use Case: Detect anomalies such as missing co-purchased items or irregular
transactions.
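A hedged sketch of the co-purchase recommendation above, expressed as a graph traversal through the Neo4j JavaScript driver; the Customer and Product labels and the BOUGHT relationship are illustrative assumptions, not part of the original notes.
// "Customers who bought this product also bought ..." traversal sketch.
const neo4j = require('neo4j-driver');
const driver = neo4j.driver('bolt://localhost:7687', neo4j.auth.basic('neo4j', 'password'));

async function alsoBought(product) {
  const session = driver.session();
  try {
    const result = await session.run(
      'MATCH (:Product {name: $product})<-[:BOUGHT]-(c:Customer)-[:BOUGHT]->(other:Product) ' +
      'WHERE other.name <> $product ' +
      'RETURN other.name AS suggestion, count(c) AS buyers ' +
      'ORDER BY buyers DESC LIMIT 5',
      { product }
    );
    return result.records.map(r => r.get('suggestion'));
  } finally {
    await session.close();
  }
}

alsoBought('laptop').then(console.log).finally(() => driver.close());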
4. Knowledge Graphs
Graph databases are effective for building knowledge graphs to model complex, interrelated
domains.
• Example:
o Nodes: Concepts, events, or entities.
o Edges: Relationships between those concepts (e.g., “Paris is the capital of
France”).
• Use Case: Search engines (e.g., Google Knowledge Graph) answering user queries directly.
When Not to Use Graph Databases
While graph databases are powerful, they are not suitable for all scenarios. Below are some situations where graph databases may not be the best choice:
1. Updating All (or Most) Entities
Graph databases struggle with scenarios where large-scale updates must be performed on all nodes or relationships.
• Example: Updating a global property (e.g., “Add a discount flag to all products”).
• Issue: Traversing every node can be slow and inefficient.
2. Bulk Analytical Operations
Graph databases are not optimized for large-scale bulk operations that process entire datasets.
• Example: Performing aggregations, summations, or bulk updates across millions of records.
• Alternative: Relational or columnar databases are better suited for these use cases.
3. Simple, Tabular Data with Few Relationships
When the data is flat and relationships are few, a graph model adds complexity without benefit.
• Example: A simple e-commerce store with customers and orders stored in flat tables.
• Alternative: Use relational databases like MySQL or PostgreSQL for efficient performance.
4. Small Datasets
For datasets with very few nodes and relationships, the benefits of graph databases diminish.