Module 3 and 4

Module-4

1. Explain Hive integration and the workflow steps involved, with a diagram.
2. Describe the Map tasks, Reduce tasks and the MapReduce execution process.
3. Describe the Hive architecture and its characteristics.
4. Describe the Pig architecture, the features of Pig, and its applications.
5. Differentiate between Pig and MapReduce.

Module-3

1. Explain the NoSQL data store and its characteristics.
2. Describe the principle of working of the CAP theorem.
3. Demonstrate the working of a key-value store with an example.
4. Explain NoSQL Data Architecture Patterns.
5. What are the different ways of handling Big Data problems?
6. Explain Shared Nothing Architecture for Big Data tasks.

2) Describe the Map tasks, Reduce tasks and the MapReduce execution process.

A) MapReduce is the data processing layer of Hadoop. It processes the huge volumes of
structured and unstructured data stored in HDFS.

MapReduce Execution Steps:


MapReduce processes the data in several phases with the help of different components.
The steps of job execution in Hadoop are as follows:
1. Input Files: The data for a MapReduce job is stored in input files, which reside in
HDFS. The input file format is arbitrary; line-based log files and binary formats can
both be used.
2. InputSplits: An InputSplit represents the portion of data that will be processed by an
individual Mapper. One map task is created for each split.
3. RecordReader: It communicates with the InputSplit and converts the data into
key-value pairs suitable for reading by the Mapper. By default, RecordReader uses
TextInputFormat to convert the data into key-value pairs.
4. Mapper: It processes each input record produced by the RecordReader and generates
intermediate key-value pairs. The intermediate output can be completely different from
the input pair; the output of the mapper is the full collection of key-value pairs.
5. Combiner: The combiner is a mini-reducer that performs local aggregation on the
mapper's output. It minimizes the data transfer between mapper and reducer.
6. Shuffling and Sorting: After partitioning, the output is shuffled to the reduce nodes.
Shuffling is the physical movement of the data over the network.
7. Reducer: The reducer takes the set of intermediate key-value pairs produced by the
mappers as its input, then runs a reduce function on each group of values to generate
the output.
8. Output: OutputFormat defines how the RecordWriter writes the output key-value
pairs to output files. The OutputFormat instances provided by Hadoop write files in
HDFS; thus they write the final output of the reducer to HDFS.
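
A minimal word-count sketch of the Mapper and Reducer phases, written as Hadoop
Streaming scripts in Python (the file names are placeholders; this is an
illustration, not a complete production job):

    #!/usr/bin/env python3
    # mapper.py -- Hadoop Streaming feeds input lines on stdin; the mapper
    # emits intermediate "key<TAB>value" pairs, here (word, 1).
    import sys

    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")

    #!/usr/bin/env python3
    # reducer.py -- the shuffle/sort phase delivers the pairs grouped and
    # sorted by key, so equal words arrive on consecutive lines.
    import sys

    current_word, current_count = None, 0
    for line in sys.stdin:
        line = line.rstrip("\n")
        if not line:
            continue
        word, count = line.rsplit("\t", 1)
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")

A hypothetical invocation (all paths are placeholders): hadoop jar
hadoop-streaming.jar -input /data/in -output /data/out -mapper mapper.py
-reducer reducer.py -files mapper.py,reducer.py. Because summing counts is
associative, passing reducer.py with the -combiner option as well would make it
act as the combiner described in step 5.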

3) Describe the Hive architecture and its characteristics.


A)
1) Explain Hive integration and the workflow steps involved, with a diagram.
A)
4) Describe the Pig architecture, the features of Pig, and its applications.

A)

Execution Flow in Pig:


1. Parser: The parser checks the script for syntax errors and type errors and outputs a
DAG (directed acyclic graph) of the Pig Latin statements, called the logical plan.
2. Optimizer: The DAG is sent to the logical optimizer, where optimization
activities are performed automatically to reduce data flow and improve
efficiency.
3. Compiler: After optimization, the compiler generates a series of MapReduce
jobs that correspond to the logical plan.
4. Execution Engine:
a. The MapReduce jobs are submitted for execution by the execution
engine.
b. The jobs run on the cluster, performing the computations, and output the
final result.
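
For comparison (see also question 5, Pig versus MapReduce), the dataflow a short
Pig script declares can be mimicked in a few lines of plain Python. The file
name and field layout below are assumptions; the sketch only illustrates the
LOAD -> FILTER -> GROUP -> COUNT pipeline that the parser turns into a DAG:

    from collections import Counter

    with open("access_log.txt") as f:                            # LOAD
        records = (line.split() for line in f)
        errors = (r for r in records if r and r[-1] == "500")    # FILTER
        counts = Counter(r[0] for r in errors)                   # GROUP + COUNT

    for host, n in counts.items():                               # DUMP / STORE
        print(host, n)

In Pig, each of these steps would be one Pig Latin statement, and the compiler
would translate the optimized plan into MapReduce jobs as described above.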
Module-3
1) What is NoSQL? Explain the CAP Theorem.
A) NoSQL refers to a class of database management systems that don’t rely on the
traditional relational database model (tables, rows, and SQL for querying).
● NoSQL databases are designed for flexibility, scalability, and
high-performance data storage.

CAP Theorem
In distributed systems, the CAP Theorem states that among Consistency (C),
Availability (A), and Partition Tolerance (P), only two can be fully achieved
simultaneously. Here’s a breakdown of these principles:

1. Consistency (C)
Consistency ensures that all copies of the data reflect the same value at any given
time, similar to traditional databases. In distributed databases, consistency means
that:
● All nodes observe the same data simultaneously.
● Changes made in one partition should immediately reflect in other related
partitions and tables using that data.
2. Availability (A):
Availability ensures that the system provides a response to every request, even in the
event of a failure. This means:
● If one partition becomes inactive, other copies of the data in active partitions
remain accessible.
● Distributed systems use replication to maintain availability, ensuring that if one
node fails, another can handle requests.

3. Partition Tolerance (P):


Partition tolerance ensures the system continues functioning even when there is a
network failure or communication breakdown between parts of the system. It involves:
● Dividing a large database into smaller partitions while ensuring they operate
independently without disrupting overall functionality.
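
A toy sketch of the trade-off during a partition (the two-replica setup and all
names are illustrative assumptions, not any real database's API): a CP-style
system rejects writes it cannot replicate, while an AP-style system accepts
them and lets the replicas diverge until the partition heals.

    class Replica:
        def __init__(self):
            self.value = None

    def write(primary, secondary, value, partitioned, mode):
        if not partitioned:
            primary.value = secondary.value = value   # normal replication
            return "ok"
        if mode == "CP":
            # Consistency over availability: refuse the write rather than
            # let the two copies disagree.
            return "error: replica unreachable"
        # mode == "AP": availability over consistency: accept the write
        # locally; the secondary is now stale until the partition heals.
        primary.value = value
        return "ok (replicas may disagree)"

    a, b = Replica(), Replica()
    print(write(a, b, "x=1", partitioned=True, mode="CP"))  # rejected
    print(write(a, b, "x=1", partitioned=True, mode="AP"))  # accepted, b stale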

4) Explain NoSQL Data Architecture Patterns.


A)
1. Key-Value Store:
● The simplest way to implement a schema-less data store is to use
key-value pairs (a minimal sketch follows this list).
● The characteristics of such a data store are high performance,
scalability and flexibility.


● Advantages:
1. Can handle large amounts of data and a heavy load.
2. Easy retrieval of data by key.
● Examples:
1. DynamoDB
2. Column Store Database:
● Rather than storing data in relational tuples, the data is stored in
individual cells which are further grouped into columns.
● Column-oriented databases operate on columns, so a query reads only
the columns it needs rather than entire rows.
● Advantages:
1. Data is readily available for column-wise reads and aggregation.
● Examples:
1. HBase
2. Bigtable (by Google)
3. Document Database:
● The document database fetches and accumulates data in the form of
key-value pairs, but here the values are called documents.
● A document can be stated as a complex data structure.
● Advantages:
1. This type of format is very useful and apt for semi-structured data.
2. Storage, retrieval, and management of documents are easy.
● Examples:
1. MongoDB
2. CouchDB
4. Graph Databases:
● This architecture pattern deals with the storage and management
of data as graphs.
● Graphs are structures that depict connections between two or
more objects in the data.
● Advantages:
1. Fast traversal, because the connections between objects are stored directly.
2. Spatial data can be easily handled.
● Examples:
1. Neo4J
2. FlockDB
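
A minimal in-memory sketch of how a key-value store works (this also answers
question 3 of Module-3). The class and method names are illustrative, not the
API of DynamoDB or any real store; the point is that values are opaque to the
store and are fetched by a single key lookup:

    class KeyValueStore:
        def __init__(self):
            self._data = {}

        def put(self, key, value):
            self._data[key] = value               # the value is an opaque blob

        def get(self, key, default=None):
            return self._data.get(key, default)   # single lookup by key

        def delete(self, key):
            self._data.pop(key, None)

    store = KeyValueStore()
    store.put("user:42", {"name": "Asha", "cart": ["book", "pen"]})
    print(store.get("user:42"))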
6) Explain Shared Nothing Architecture for Big Data tasks.
A) In a shared-nothing architecture, each node is independent and self-sufficient:
nodes share neither memory nor disk, so any node can work on its portion of the
data without contending with the others. Big Data stores distribute data across
such nodes using the following models:

1) Single Server Model
● A single server processes the data sequentially; this suits workloads that fit
on one machine.

2) Sharding Model
● The data set is split into parts (shards) and each shard is placed on its own
node, so every node processes only the data it holds (a sketch follows this
answer).

3) Master-Slave Distribution Model
● One master node handles writes and directs slave nodes, which hold replicas
and serve reads.

4) Peer-to-Peer Distribution Model
● All nodes are equal: any node can accept reads and writes, and the peers
replicate data among themselves.
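
A minimal sketch of hash-based sharding in a shared-nothing setup (the shard
count, key format, and use of Python dicts as stand-in nodes are all
assumptions):

    import hashlib

    NUM_SHARDS = 4
    shards = [{} for _ in range(NUM_SHARDS)]   # one dict per independent node

    def shard_for(key: str) -> int:
        # Hash the key so records spread evenly across the shards.
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_SHARDS

    def put(key, value):
        shards[shard_for(key)][key] = value    # only one node owns this key

    def get(key):
        return shards[shard_for(key)].get(key)

    put("user:42", "Asha")
    print(get("user:42"), "is on shard", shard_for("user:42"))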
