Siddaganga Institute of Technology, Tumakuru - 572 103: Usn 1 S I CSPE34

1) The data generated from a GPS satellite and web logs is classified as big data. Big data is characterized by its volume, variety, and velocity. The major parallel computing platforms are Hadoop and Spark. Hadoop is an open source framework that enables storing large volumes of data in a distributed manner across machines. 2) In Hive, the term 'aggregation' is used to group data. A MapReduce Oozie action effectively creates two MapReduce jobs: map and reduce. MongoDB is an example of a non-relational database. If the mapper output does not match the reducer input, the job will fail. 3) Cloud deployment models include public, private, hybrid, and community clouds. Block

USN 1 S I CSPE34

Siddaganga Institute of Technology, Tumakuru – 572 103


(An Autonomous Institution affiliated to VTU, Belagavi, Approved by AICTE, New Delhi)

Seventh Semester B.E. Computer Science & Engg. Examinations Dec. 2017
Big Data
Time: 3 Hours Max. Marks: 100
Note: 1. Question No. 1 is compulsory.
2. Answer any 4 full questions from Question No. 2 to Question No. 6.

1 a) The data generated from a GPS satellite and web logs is classified as ________.
b) The data being captured can be in any form or structure. Which characteristic of Big Data
does this describe?
c) Any 2 major parallel computing platforms are ______ and _______.
d) ______ is an open-source framework that enables you to store large volumes of data in a
distributed manner across multiple machines.
e) Technology that manages traffic between virtual and physical machines is known as ______.
f) Hive uses _______ to store metadata.
g) In Hive, the term ‘aggregation’ is used to _______.
h) A MapReduce Oozie action effectively creates two MapReduce jobs. They are ______ and ______.
i) ______ is an example of a non-relational database.
j) What happens if the mapper output does not match the reducer input? 1 × 10
k) Briefly define any two types of financial frauds.
ℓ) Mention any 4 cloud deployment models.
m) Mention any 2 functions of block servers.
n) Define Brewer’s theorem.
o) Explain the different types of MapReduce applications. 2 × 5
2 a) List the steps that SNA follows to detect fraud. 6
b) List the differences between parallel and distributed systems. 6
c) What is distributed computing? Explain the working of a distributed computing environment. 8
3 a) Explain the virtualization process for the different elements of the Big Data environment.
How is virtualization managed with a hypervisor? 10
b) With a neat diagram, describe the architecture of HDFS. 10
4 a) List and explain the key building blocks of the Hadoop platform management layer. 6
b) Describe the points you need to consider while designing a file system in MapReduce. 6
c) What is a non-relational database? Explain the characteristics of non-relational databases. 8
5 a) Can MapReduce be used to solve any kind of computational problem? If not, explain the cases
where MapReduce is not applicable. 6
b) Describe the role of the combiner in MapReduce processing. 6
c) List any eight guidelines to follow while implementing MapReduce applications. 8
6 a) Mention any 6 Hive services. 6
b) Write Hive commands for:
i. Creating a database named Super-Market.
ii. Creating 2 tables with the given properties:
Customer:  Cust-name     char
           Cust-avg-Exp  num
Items:     Item-name     char
           Item-price    num
iii. Write a script to copy this table structure into a new table. 6
c) List and explain the key concepts of the Oozie coordinator. 8
________
