BDA Questions
2. Explain PageRank with an example. Can a website's PageRank ever increase? What are its
chances of decreasing?
3. Explain Hubs and Authorities with a neat diagram.
4. With respect to data stream querying, give an example of each of the following:
a. One-time queries
b. Continuous queries
c. Pre-defined queries
d. Ad-hoc queries
5. Explain the Hadoop ecosystem with its core components. Explain its physical architecture.
State the limitations of Hadoop.
6. What is MapReduce? Explain how Map and Reduce work. What is shuffling in MapReduce?
7. How would you get the features of the document in a content-based system? Explain
document similarity.
8. What is a triangular matrix? How is it used for main-memory counting?
9. Explain the collaborative-filtering-based recommendation system. How is it different from
content-based recommendation systems?
10. What are combiners? When should one use a combiner in a MapReduce job?
11. How can distinct elements in a stream be counted? Explain the Flajolet-Martin algorithm.
12. What is NoSQL? What are the business drivers for NoSQL? Discuss any two architectural
patterns of NoSQL.
13. What is a Data Stream Management System? Explain with a block diagram.
14. Discuss any five characteristics of Big Data.
15. Describe the structure of HDFS in a Hadoop Ecosystem using a diagram.
16. Define social networks and social network mining.
17. Explain the Hamming distance measure with an example.
18. Describe the characteristics of a NoSQL database.
19. Explain the concept of MapReduce using an example. Write MapReduce pseudocode for
“Group By” “aggregation” in a database.
20. Explain the concept of a Bloom filter using an example.
21. Explain any one algorithm to count the number of distinct elements in a data stream.
22. Draw a diagram showing the structure of the WWW and explain its different parts.
23. What are recommendation systems? Clearly explain two applications for Recommendation
Systems.
24. Explain in detail any one ranking algorithm used by search engines.
25. Explain with diagrams the Park-Chen-Yu (PCY) algorithm for frequent itemset mining.
26. What is a community in a Social Network Graph? Explain any one algorithm for finding
communities in a social graph.
27. Explain which characteristics of social networks make them Big Data.
28. What do you mean by Jaccard Similarity? Illustrate with an example. Describe two
applications that can use Jaccard Similarity.
29. What are the challenges of querying large data streams?
30. What do you understand by BASE properties in a NoSQL database? Explain in detail any one
NoSQL architecture pattern. Identify two applications that can use this pattern.
31. Write MapReduce pseudocode to multiply two matrices.
32. Describe any three components of a typical Hadoop ecosystem.
33. Explain the CURE algorithm for clustering large datasets. Illustrate the algorithm using
appropriate figures.
34. Compare Big Data analytics with traditional data mining and warehousing systems.
35. How can Big Data analytics be useful in the development of Digital India?
36. What are distance measures? Briefly describe any two distance measures.
37. List the steps in the HITS algorithm with one example.
38. Differentiate between RDBMS and NoSQL databases.
39. How is recommendation done based on the properties of a product? Explain with a suitable
example.
40. Explain the Girvan-Newman algorithm for mining social graphs.
41. How are Big Data problems handled by the Hadoop system?
42. List the steps in the modified PageRank algorithm to avoid spider traps, with one example.
43. Explain the DGIM algorithm for counting ones in a stream, with an example.