21CA3101 Unit II QB

The document is a question bank for the Big Data Analytics course at Francis Xavier Engineering College for the academic year 2024-25. It outlines the distribution of questions across different parts and topics, focusing on mining data streams and real-time analytics applications. The question bank includes various types of questions aimed at assessing students' understanding of stream data models, filtering streams, and real-time analytics techniques.

FRANCIS XAVIER ENGINEERING COLLEGE, TIRUNELVELI

(An Autonomous Institution)


DEPARTMENT OF MCA
ACADEMIC YEAR: 2024-25 / ODD | BATCH: 2023 – 25 | SEM: 03
Course Code | Name: 21CA3101 – BIG DATA ANALYTICS
QUESTION BANK
Description of part | No. of questions needed
Part – A | 8
Part – B | 8
Part – C | 2
TOTAL | 18
Unit II – MINING DATA STREAMS
Introduction to Streams Concepts – Stream Data Model and Architecture – Stream Computing – Sampling Data in a Stream – Filtering Streams – Counting Distinct Elements in a Stream – Estimating Moments – Counting Ones in a Window – Decaying Window – Real Time Analytics Platform (RTAP) Applications – Case Studies – Real Time Sentiment Analysis, Stock Market Predictions.
Course Outcome – CO2: Design efficient algorithms for mining the data from large volumes.
Important topics*
• Topic 1 – Introduction to Streams Concepts – Stream Data Model and Architecture
• Topic 2 – Stream Computing – Sampling Data in a Stream
• Topic 3 – Filtering Streams 1
• Topic 4 – Filtering Streams 2
• Topic 5 – Real Time Analytics Platform (RTAP) Applications
Distribution of questions based on important topics*
No. of topics in a unit | Part A (in each topic) | Part B (in each topic) | Part C (in each topic)
1 | 3 | 1 | -
2 | 3 | 1 | -
3 | 3 | 3 | 1
4 | 3 | 2 | -
5 | 3 | 2 | 1
PART – A (2 marks)
Q. No | Question | Max. Marks | Topic | CO | BL | KC | PI
1. | Depict the architecture of a Data Stream Management System. | 02 | T1 | 1 | K2 | C | PO3-3.1.1
2. | The output of a Data Stream Management System is knowledge. Prove it. | 02 | T1 | 1 | K3 | P | PO2-2.1.2
3. | How can data be stored in a repository? | 02 | T1 | 1 | K2 | P | PO3-3.1.1
4. | In a YouTube channel with 500K subscribers, how can the number of views per day be predicted? | 02 | T2 | 1 | K3 | P | PO2-2.1.1
5. | Figure out the type of data that will be obtained from a satellite. | 02 | T2 | 1 | K2 | P | PO2-2.1.1
6. | Compare standing queries and stream queries. | 02 | T2 | 1 | K2 | P | PO3-3.1.1
7. | How can a random sample be obtained from a huge amount of data? | 02 | T3 | 1 | K3 | MC | PO2-2.1.1
8. | Differentiate filtering of streams from stream queries. | 02 | T3 | 1 | K2 | P | PO1-1.3.1
9. | Capturing a moment at regular intervals and recording it produces a huge amount of data. How is the captured moment estimated? | 02 | T3 | 1 | K3 | MC | PO2-2.1.1
10. | Depict the decaying window. | 02 | T4 | 1 | K2 | C | PO2-2.1.2
11. | How can a sample of data be filtered from a huge volume of data? | 02 | T4 | 1 | K3 | MC | PO1-2.1.2
12. | The higher the moment, the harder it is to estimate in order to obtain quality data. Sketch out a solution for the above scenario. | 02 | T4 | 1 | K3 | MC | PO1-1.3.1
13. | Summarize the need for on-demand real-time analytics when a bank faces fraudulent payments and money laundering. | 02 | T5 | 1 | K2 | P | PO1-1.3.1
14. | Does sentiment analysis rely on opinion mining? | 02 | T5 | 1 | K2 | P | PO2-2.1.1
15. | Give the key factors that predict the profit in a stock market exchange. | 02 | T5 | 1 | K2 | P | PO2-2.1.1
PART – B (13 marks)
Q. No | Question | Max. Marks | Topic | CO | BL | KC | PO
1 | Explore the sources from which data can be generated. Give the formats that are obtained from a source within a stipulated time. | 13 | T1 | 1 | K2 | P | PO1-2.1.2
2 | Web sites often like to report the number of unique users over the past month. If we think of each login as a stream element, we can maintain a window that is all logins in the most recent month. Discuss a relevant stream query to obtain the unique users in the past month. | 13 | T2 | 1 | K3 | MC | PO2-2.2.3
3 | Imagine a stream that contains videos and images together. Explain how the distinct elements can be counted. | 13 | T3 | 1 | K3 | MC | PO2-2.2.3
4 | How is the estimation of moments carried out over a period of time? Justify that the 0th moment counts the distinct elements. Narrate it with a neat example. | 13 | T3 | 1 | K3 | MC | PO2-2.2.3
5 | Write C code to traverse the given array, considering every window of size K in it and keeping a count of the distinct elements of the window using the naïve approach. | 13 | T3 | 1 | K3 | MC | PO2-2.2.3
6 | Stress the importance of filtering of streams when a source sends data of different data types to cloud storage. | 13 | T4 | 1 | K3 | MC | PO2-2.2.3
7 | If a stream has n elements, of which m are distinct, what are the minimum and maximum possible surprise numbers, as a function of m and n? | 13 | T4 | 1 | K3 | MC | PO2-2.2.3
8 | Suppose we have a window of length N on a binary stream. What are the techniques that can be used to answer queries of the form “how many 1’s are there in the last k bits?” for any k ≤ N? | 13 | T5 | 1 | K2 | P | PO1-2.1.2
9 | How are stochastic variables involved in predicting the real-time stock market exchange? Explain in detail. | 13 | T5 | 1 | K2 | P | PO3.34.3
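Question 5 of Part B asks for a naïve C traversal that counts the distinct elements in every window of size K. The following is a minimal sketch of one possible answer, not a model solution from the question bank. It assumes the array holds `int` values; the function names `count_distinct_in_window` and `distinct_per_window` are illustrative and do not come from the source.

```c
#include <stdio.h>

/* Count the distinct values in arr[start .. start + k - 1].
   Naive approach: an element is new only if no earlier element
   of the same window has the same value. */
int count_distinct_in_window(const int arr[], int start, int k)
{
    int distinct = 0;
    for (int i = start; i < start + k; i++) {
        int seen_before = 0;
        for (int j = start; j < i; j++) {
            if (arr[j] == arr[i]) {
                seen_before = 1;
                break;
            }
        }
        if (!seen_before)
            distinct++;
    }
    return distinct;
}

/* Slide a window of size k over arr[0 .. n - 1], one position at a time.
   out must have room for n - k + 1 counts.
   Returns the number of windows, or 0 if k is not in the range 1..n. */
int distinct_per_window(const int arr[], int n, int k, int out[])
{
    if (k <= 0 || k > n)
        return 0;
    int windows = n - k + 1;
    for (int s = 0; s < windows; s++) {
        out[s] = count_distinct_in_window(arr, s, k);
        printf("Window starting at index %d: %d distinct\n", s, out[s]);
    }
    return windows;
}
```

Calling `distinct_per_window` on the array {1, 2, 1, 3, 4, 2, 3} with K = 4 gives the counts 3, 4, 4, 3 for the four windows. Each window is rescanned from scratch, so the sketch takes on the order of (n − K + 1) · K² comparisons, which is the cost the naïve approach accepts in exchange for needing no extra data structure.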
PART – C (15 marks)
Q. No | Question | Max. Marks | Topic | CO | BL | KC | PO
1 | A search engine receives a stream of queries, and it would like to study the behaviour of typical users. We assume the stream consists of tuples (user, query, time). Suppose that we want to answer queries such as “What fraction of the typical user’s queries were repeated over the past month?” What kind of filtering techniques can be applied? | 15 | T3 | 1 | K3 | MC | PO3-3.1.5
2 | Twitter has become a central site where people express their opinions and views on political parties and candidates. How can we explore the events that affect the public? | 15 | T5 | 1 | K3 | MC | PO3-3.1.5

Topic | Part A: No. of Questions | Part A: Total Marks | Part B: No. of Questions | Part B: Total Marks | Part C: No. of Questions | Part C: Total Marks
Topic 1 – T1 | 3 | 6 | 1 | 13 | - | -
Topic 2 – T2 | 3 | 6 | 1 | 13 | - | -
Topic 3 – T3 | 3 | 6 | 3 | 39 | 1 | 15
Topic 4 – T4 | 3 | 6 | 2 | 26 | - | -
Topic 5 – T5 | 3 | 6 | 2 | 26 | 1 | 15
TOTAL | 15 | 30 | 9 | 117 | 2 | 30
