BDA Question Bank - 2023

This document contains study material and a question bank for the subject Big Data Analytics. It is divided into five modules covering the evolution of big data, the Hadoop ecosystem, NoSQL data stores, Hive and Pig, and machine learning and social network analysis. Each module lists short-answer and long-answer questions on its topics, drawn from university examinations held in February/March 2022 and July/August 2022.

Uploaded by

aslankn123

|| Jai Sri Gurudev ||

Sri Adichunchanagiri Shikshana Trust (R)

SJB INSTITUTE OF TECHNOLOGY

Study material
Question Bank
Subject Name: Big Data Analytics
Subject Code: 18CS72
By

Faculty Name: Dr. Pavitra Bai S


Designation: Associate Professor
Semester: VII

Department of Information Science & Engineering


Academic Year: Odd Sem / 2023-24
Module 1
1. Discuss the evolution of Big Data. (06 Marks) (Feb/March 2022)
2. Explain the characteristics of Big Data. (04 Marks) (Feb/March 2022)
3. With a neat block diagram, explain Data Architecture Design (10 Marks) (Feb/March 2022)
4. Write notes on Scalability to Big Data and Massive Parallel Processing Platforms.
(12 Marks) (Feb/March 2022)
5. Highlight big data analytics with one case study (08 Marks) (Feb/March 2022)
6. Define data, web data and big data. Also explain structured, semi-structured and unstructured
data. (10 Marks) (July/August 2022)
7. List and explain the characteristics of big data. Using an e-commerce example, illustrate
how big data is used. (10 Marks) (July/August 2022)
8. With a neat diagram, explain the function of each of the five layers in big data architecture
design. (12 Marks) (July/August 2022)
9. How does the Berkeley Data Analytics Stack help in analytics tasks? (08 Marks) (July/August 2022)

Module 2
1. What are the core components of Hadoop? Briefly explain each of its components.
(10 Marks) (Feb/March 2022)
2. Explain HDFS (10 Marks) (Feb/March 2022)
3. Define the MapReduce framework and its functions. (06 Marks) (Feb/March 2022)
4. Write down the steps of a request to MapReduce and the types of processes in MapReduce.
(10 Marks) (Feb/March 2022)
5. Write a short note on the “Flume Hadoop Tool”. (04 Marks) (Feb/March 2022)
6. With a neat diagram, explain the Hadoop main components and ecosystem components.
(08 Marks) (July/August 2022)
7. Briefly describe the features of Hadoop HDFS. Also explain the functions of the Name Node and Data
Node. (08 Marks) (July/August 2022)
8. Explain any two HDFS commands with examples. (04 Marks) (July/August 2022)
9. Explain the following: (12 Marks) (July/August 2022)
(i) HDFS block replication (ii) HDFS safe mode (iii) Rack awareness (iv) Name Node high
availability.
10. Discuss the Apache Sqoop Import and Export methods with neat diagrams.
(08 Marks) (July/August 2022)

Module 3
1. Discuss the characteristics of NoSQL data stores along with the features of NoSQL
transactions. (08 Marks) (Feb/March 2022)
2. With neat diagrams, explain the following shared-nothing architectures:
i) Single server model
ii) Sharding very large databases
iii) Master-slave distribution model
iv) Peer-to-peer distribution model. (12 Marks) (Feb/March 2022)
3. Define a key-value store with an example. What are the advantages of a key-value store?
(10 Marks) (Feb/March 2022)
4. Write down the steps by which a client reads and writes values using a key-value store. What
are the typical uses of a key-value store? (10 Marks) (Feb/March 2022)
5. List and compare the features of Big Table, RC, ORC and Parquet data stores.
(10 Marks) (July/August 2022)
6. With an example, explain the key-value store. (10 Marks) (July/August 2022)
7. Discuss the usage of MongoDB, Cassandra, CouchDB, Oracle NoSQL and Riak.
(10 Marks) (July/August 2022)
8. List the Pros and Cons of distribution using sharding. (05 Marks) (July/August 2022)
9. Give the comparison between NoSQL and SQL/RDBMS. (05 Marks) (July/August 2022)

Module 4
1. Define a key-value store with an example. What are the advantages of a key-value store?
(10 Marks) (Feb/March 2022)
2. Write down the steps by which a client reads and writes values using a key-value store. What
are the typical uses of a key-value store? (10 Marks) (Feb/March 2022)
3. Use HiveQL to do the following: (10 Marks) (Feb/March 2022)
(i) Create a table with a partition. (ii) Add, rename and drop a partition of a table.
4. What is Pig in Big Data? Explain the features of Pig. (10 Marks) (Feb/March 2022)
5. Describe MapReduce Execution steps with a neat sketch. (12 Marks) (July/August 2022)
6. How can node failure be handled in Hadoop? Discuss. (08 Marks) (July/August 2022)
7. With a neat diagram, describe Hive integration and workflow steps.
(10 Marks) (July/August 2022)
8. Explain the Hive built-in functions, with their return types and syntax. (10 Marks) (July/August 2022)

Module 5
1. In machine learning, explain linear and nonlinear relationships with the essential graphs.
(10 Marks) (Feb/March 2022)
2. Draw the block diagram of text mining and describe its phases. (10 Marks) (Feb/March 2022)
3. Define multiple regression and write down examples of its use in forecasting and
optimization. (10 Marks) (Feb/March 2022)
4. Explain the parameters of social graph network topological analysis using centralities and
PageRank. (10 Marks) (Feb/March 2022)
5. Discuss Regression Analysis using Linear and Non-linear regression models.
(10 Marks) (July/August 2022)
6. Explain, with an example, the Apriori algorithm to evaluate candidate key.
(10 Marks) (July/August 2022)
7. Write a note on: (i) Web mining (ii) Web content mining (iii) Web usage mining.
(12 Marks) (July/August 2022)
8. How are cliques used to discover communities in social network analysis?
(04 Marks) (July/August 2022)
9. Define PageRank. (04 Marks) (July/August 2022)
