CTT339 BIG DATA MID-TERM
TH2014/02 EXAM
Examinee: Student ID:
INSTRUCTIONS
• You have 60 minutes to answer 10 questions and 2 bonus questions.
• This is an open-book exam; electronic devices without an Internet connection are allowed.
• Give your answers and show your work in the space provided.
• Be aware that redundant answers may cost you points.
QUESTIONS
1. Define Big Data in your own words. [1pt]
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
2. Big Data is mainly characterized by the 3 V's. What are the 3 V's? [1pt]
V___________________, V_________________, and V___________________
3. Hadoop is considered an alternative to traditional RDBMSs and data warehouses. Is this
TRUE or FALSE? Briefly explain your choice. [1pt]
Your choice is _________________.
Your reason:
_________________________________________________________________________________
_________________________________________________________________________________
4. What are the main improvements of HDFS in Hadoop 2.x over Hadoop 1.x? Name the
technologies and briefly describe them. [1pt]
________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
5. Hadoop 2.x allows the use of multiple Standby NameNodes. Is this TRUE or FALSE? Briefly
explain your choice. [1pt]
Your choice is _________________.
Your reason:
_________________________________________________________________________________
_________________________________________________________________________________
6. What are the main differences between the Secondary NameNode and the Standby NameNode? [1pt]
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
7. Assume that you need to analyze a large amount of data streaming continuously from a
spacecraft to a ground station. Is Hadoop MapReduce a good solution? Briefly explain your
choice. If MapReduce is not suitable, suggest another solution. [1pt]
Your choice is _________________.
Your reason:
_________________________________________________________________________________
_________________________________________________________________________________
Your suggestion (if necessary): ___________________________________________________
8. What is a Single Point of Failure (SPOF)? [1pt]
_________________________________________________________________________________
_________________________________________________________________________________
9. How would you protect the NameNode against different types of failures? [1pt]
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
10. Describe, in general terms, the process of checkpointing. [1pt]
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
11. What is the default replication factor for blocks in HDFS? How does HDFS place these
replicas in the cluster? [+1pt]
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
_________________________________________________________________________________
12. What is Hadoop Summit? [+1pt]
_________________________________________________________________________________
_________________________________________________________________________________