Data Warehouse Interview Questions:
What is a Data Warehouse?
A data warehouse is a centralized repository that stores large volumes of structured and
semi-structured data from multiple sources. It is similar to a relational database but is
designed specifically for query and analysis rather than transaction processing, making it a
key component in business intelligence and analytics.
Data warehouses facilitate the use of BI tools, enabling users to create reports, dashboards,
and visualizations for better decision-making.
What is the basic difference between a Data Warehouse and an Operational Database?
Data Warehouse:
1. Contains historical information, which helps in analysing business metrics.
2. A DW is mainly used to read data.
3. End users are business analysts/data analysts.
4. Used for analytical processing and reporting (OLAP).
Operational Database:
1. Contains current information that is required to run the business.
2. Database is mainly used to write the data.
3. End users are operational team members.
4. Used for transactional processing (OLTP).
What is Data Warehousing?
Data Warehousing is the act of organising and storing data in a way that makes its retrieval
efficient and insightful.
It is also called the process of transforming data into information.
What is OLAP?
OLAP (Online Analytical Processing) is a flexible way to perform complex analysis of
multidimensional data.
Data present in a data warehouse is accessed by running OLAP queries, whereas operational
databases are queried through OLTP operations.
OLAP activities are performed by converting the multidimensional data in a data warehouse
into an OLAP cube.
What is OLTP? How is OLAP different from OLTP?
OLTP (Online Transaction Processing) is a class of systems designed to manage and execute
day-to-day transactional operations efficiently. It is commonly used in database
environments that handle high volumes of short, real-time transactions, such as sales,
orders, payments, or other business activities.
Examples:
1. Banking systems for processing payments or withdrawals.
2. Airline reservation systems for booking tickets.
Any system that is absolutely critical for running the business can be categorised as OLTP.
Whereas any system that is used for analysing how the business is running can be
categorised as OLAP.
Characteristics of OLTP:
- Transactional Systems: OLTP systems are designed to handle frequent, small
transactions like inserting, updating, or deleting records.
- High Volume: OLTP systems process a large number of transactions per second,
ensuring fast, real-time updates.
- Data Integrity: OLTP systems prioritize maintaining the accuracy and integrity of data
across multiple users and transactions. This is often achieved through ACID
properties (Atomicity, Consistency, Isolation, Durability).
- Normalized Database: OLTP systems use normalized databases to minimize
redundancy, ensuring data consistency and quick updates.
The key differences between OLTP and OLAP:
- Purpose: OLTP is designed for day-to-day transaction processing, while OLAP is used
for complex queries and data analysis.
- Data Structure: OLTP systems use highly normalized databases (often in 3rd normal
form) to ensure data consistency and minimize redundancy. OLAP systems use
denormalized structures (e.g., star or snowflake schemas) to optimize query
performance.
- Data Volume: OLTP deals with smaller volumes of data from frequent, short
transactions. OLAP handles large volumes of aggregated, historical data for analytical
purposes.
- Query Type: OLTP queries are simple and involve operations like insert, update, and
delete. OLAP queries are complex, involving aggregations, multidimensional analysis,
and often long-running queries.
- Response Time: OLTP systems are optimized for fast response times (milliseconds) to
handle a high volume of transactions. OLAP queries may take longer (seconds to
minutes) as they involve complex computations.
- Users: OLTP supports a large number of concurrent users, typically performing
transactions. OLAP is designed for fewer users, usually analysts or decision-makers,
who perform in-depth data analysis.
- Examples: OLTP systems are used in banking, e-commerce, and retail for managing
day-to-day transactions. OLAP systems are used in business intelligence, financial
reporting, and data warehousing for decision support and trend analysis.
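The contrast above can be sketched with Python's built-in sqlite3 module standing in for both kinds of system; the table and column names here are purely illustrative, not from any real schema.

```python
import sqlite3

# In-memory database with a hypothetical orders table (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style work: many small writes, each a short transaction.
with conn:
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("North", 120.0))
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("South", 80.0))
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("North", 50.0))

# OLAP-style work: a read-heavy aggregation across many rows.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('North', 170.0), ('South', 80.0)]
```

In a real deployment the writes would hit a normalized OLTP database and the GROUP BY would run against a denormalized warehouse; here one toy database plays both roles.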
What is a Dimension Table?
- A Dimension Table is a table in a data warehouse that contains descriptive attributes
(or fields) that describe the objects in a fact table. It helps provide context for the
measures or facts in the fact table by offering a more detailed description of the
entities involved in the data. Dimension tables are used in Online Analytical
Processing (OLAP)
What is a Fact Table?
- A Fact Table is the central table in a dimensional model that stores the quantitative
measures of a business process (e.g., sales amount, quantity sold).
- Facts are analysed by summing, averaging, or otherwise aggregating them across the
related dimensions.
- A Fact table contains 2 kinds of columns – dimension keys (foreign keys to dimension
tables) and measures.
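A minimal star-schema sketch, again using sqlite3 with made-up table and column names: the dimension table holds descriptive attributes, the fact table holds foreign keys plus measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dimension table: descriptive attributes keyed by a surrogate key.
conn.execute("""CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name TEXT, category TEXT)""")
# Fact table: foreign keys to dimensions plus numeric measures.
conn.execute("""CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    sale_date TEXT,
    quantity INTEGER,
    revenue REAL)""")
with conn:
    conn.execute("INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware')")
    conn.execute("INSERT INTO fact_sales VALUES (1, '2024-01-05', 3, 30.0)")
    conn.execute("INSERT INTO fact_sales VALUES (1, '2024-01-06', 1, 10.0)")

# Join fact to dimension, aggregating the measure by a descriptive attribute.
row = conn.execute("""
    SELECT d.category, SUM(f.revenue)
    FROM fact_sales f JOIN dim_product d USING (product_key)
    GROUP BY d.category""").fetchone()
print(row)  # ('Hardware', 40.0)
```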
What is the level of Granularity of a fact table?
- The level of granularity of a fact table refers to the level of detail or specificity at
which the data in the fact table is stored. It defines what each record or row in the
fact table represents. Granularity is a critical design decision in a data warehouse
because it determines the scope of the data that can be analysed and impacts both
the size of the fact table and the detail of the queries that can be run against it.
- The depth of detail in the data is known as granularity.
- A Fact table is usually designed at a fine (detailed) level of granularity, so that data
can be aggregated upward as needed.
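A small illustration of why grain matters, using invented sales tuples: rolling a daily grain up to a monthly grain shrinks the table, but day-level questions can no longer be answered from it.

```python
from collections import defaultdict

# Hypothetical sales rows: the finest available grain is one row per
# (day, store, product) combination.
daily_grain = [
    ("2024-01-01", "S1", "P1", 100),
    ("2024-01-02", "S1", "P1", 150),
    ("2024-01-15", "S1", "P1", 200),
]

# A coarser grain (one row per month/store/product) is smaller, but the
# daily detail is irreversibly lost.
monthly_grain = defaultdict(int)
for day, store, product, amount in daily_grain:
    month = day[:7]  # "YYYY-MM"
    monthly_grain[(month, store, product)] += amount

print(dict(monthly_grain))  # {('2024-01', 'S1', 'P1'): 450}
```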
What is the difference between Additive, Semi–additive and Non–additive facts?
- Additive Facts:
- Definition: Additive facts are facts that can be summed (or aggregated) across all
dimensions in the fact table. This is the most common type of fact in data
warehouses, as it can be used in a wide variety of analyses.
- Aggregation: These facts can be aggregated (using SUM, AVG, MIN, MAX, etc.) across
any dimension (e.g., time, location, product, customer).
- Semi-Additive Facts:
- Definition: Semi-additive facts can only be aggregated across some dimensions but
not all. A common limitation is with the time dimension, where aggregation (like
summing) doesn't make sense.
- Aggregation: These facts can be summed or aggregated across some dimensions but
not others (especially time). For example, you can sum semi-additive facts across
product or store, but not over time.
- Non-Additive Facts:
- Definition: Non-additive facts are facts that cannot be summed or aggregated
meaningfully across any dimension. They typically involve metrics like ratios or
percentages, where summing across dimensions would lead to incorrect results.
- Aggregation: These facts do not allow for any meaningful aggregation. You may need
special calculations (like averages or weighted averages) to analyse them across
dimensions.
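The three fact types above can be shown with a toy bank-account example (all numbers invented): deposits are additive, the balance is semi-additive over time, and a ratio built from them is non-additive.

```python
# Illustrative daily snapshots for one bank account (hypothetical numbers).
snapshots = [
    {"date": "2024-01-01", "deposits": 100.0, "balance": 100.0},
    {"date": "2024-01-02", "deposits": 50.0,  "balance": 150.0},
    {"date": "2024-01-03", "deposits": 0.0,   "balance": 150.0},
]

# Additive fact: deposits can be summed across time (and any other dimension).
total_deposits = sum(s["deposits"] for s in snapshots)

# Semi-additive fact: balances must NOT be summed over time; take the
# period-end (or average) balance instead.
end_balance = snapshots[-1]["balance"]

# Non-additive fact: a ratio such as deposits/balance cannot be aggregated
# by summing per-row ratios; recompute it from its additive parts.
overall_ratio = total_deposits / end_balance

print(total_deposits, end_balance, overall_ratio)  # 150.0 150.0 1.0
```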
What are Conformed dimensions and Conformed facts?
- Conformed Dimensions:
A conformed dimension is a dimension that is shared across multiple fact tables or
data marts within a data warehouse. It is used consistently across different areas of
the business, allowing for unified reporting and analysis across these different fact
tables. The data in a conformed dimension is consistent, meaning the same
definition, structure, and values are used across the warehouse, ensuring that
different parts of the organization are using the same information when analyzing
data.
- Conformed Facts:
A conformed fact refers to facts (measures) that are used consistently across
multiple fact tables or data marts, with the same meaning and calculation method.
This ensures that metrics like sales revenue or profit are calculated the same way
across different reports or analyses, promoting accuracy and comparability in
reporting.
What are Aggregate tables?
- Aggregate tables are specialized tables in a data warehouse that store pre-
calculated, summarized data. They are created to improve query performance by
reducing the amount of data that needs to be processed for certain types of queries,
especially when large volumes of detailed transactional data are involved. Instead of
querying the detailed fact tables, users can query the aggregate tables for faster
response times, especially when working with reports that don't require granular-
level detail.
- This table reduces the load in the database server and increases the performance of
the query.
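A sketch of how an aggregate table is built, assuming an invented `fact_sales` detail table: the summary is pre-computed once with GROUP BY, and later reports query the small summary instead of scanning the detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (sale_date TEXT, store TEXT, amount REAL)")
with conn:
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [("2024-01-01", "S1", 10.0), ("2024-01-01", "S1", 20.0),
         ("2024-01-02", "S1", 5.0)],
    )
    # Pre-compute a daily summary so reports need not scan the detail rows.
    conn.execute("""
        CREATE TABLE agg_daily_sales AS
        SELECT sale_date, store, SUM(amount) AS total_amount
        FROM fact_sales GROUP BY sale_date, store""")

rows = conn.execute(
    "SELECT sale_date, total_amount FROM agg_daily_sales ORDER BY sale_date"
).fetchall()
print(rows)  # [('2024-01-01', 30.0), ('2024-01-02', 5.0)]
```

In production the aggregate would be refreshed on a schedule (or maintained as a materialized view) so it stays in step with the detail table.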
What is Summary information?
- Summary Information is the area in the Data Warehouse where pre-defined
aggregations are kept.
- Can be stored in the form of tables or can be kept in the reporting layer such as
Tableau, Business Objects.
What is ETL?
- ETL stands for Extract - Transform - Load.
- It is the process of using the software to extract the desired data from various
sources, then transform that data by using rules and lookup tables to meet your
requirement and then loading it into a target data warehouse.
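A minimal ETL sketch in plain Python, with a CSV string standing in for the source system and sqlite3 standing in for the target warehouse; the column names and the validation rule are assumptions for illustration.

```python
import csv, io, sqlite3

# Extract: read raw records from a source (a CSV string stands in for a file).
raw = "id,amount\n1, 10.5 \n2,not_a_number\n3,4.0\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: trim whitespace, coerce types, drop rows that fail validation.
clean = []
for r in rows:
    try:
        clean.append((int(r["id"]), float(r["amount"].strip())))
    except ValueError:
        continue  # reject bad records (a real pipeline would log them)

# Load: write the cleaned rows into the target warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
with conn:
    conn.executemany("INSERT INTO sales VALUES (?, ?)", clean)

result = conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
print(result)  # (2, 14.5)
```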
What are tools available for ETL?
- Informatica PowerCenter
- Talend Studio
- DataStage
- Oracle Warehouse Builder
- Ab Initio
- Data Junction
- SQL Server Integration Services (SSIS)
- SAP Data Services
- Data Migrator (IBI)
- IBM Infosphere Information Server
- Elixir Repertoire for Data ETL
- SAS Data Management
What is Metadata?
- Metadata is data about data.
- Metadata in a DWH defines the source data, i.e. flat files, relational databases and
other objects.
- Metadata is used to define which tables are sources and targets, and which
transformations (the business logic) are applied to produce the final output.
What is Data Mining?
How is it different from data warehousing?
- Data mining is the process of analysing data across different dimensions and summarising
it into useful information. Data is searched, retrieved and analysed from a data
warehouse (or other data store) to answer business questions.
- Data Warehousing is about storing analytical data in a structure suitable for data
mining. This analytical data is extracted from operational systems, usually on a daily
basis.
List the types of OLAP servers?
- MOLAP (Multidimensional OLAP): MOLAP stores data in a multidimensional cube
format, which allows for fast data retrieval. It pre-aggregates and pre-calculates data
in the form of cubes, enabling efficient querying and analysis.
- ROLAP (Relational OLAP): ROLAP stores data in relational databases (RDBMS) and
dynamically generates SQL queries to retrieve data. It doesn't use pre-aggregated
data, so queries take longer than in MOLAP but provide greater flexibility.
- HOLAP (Hybrid OLAP): HOLAP combines the features of both MOLAP and ROLAP. It
uses pre-calculated data cubes for quick access (like MOLAP) and dynamically
queries relational databases for detailed data (like ROLAP), providing a balance
between performance and scalability.
Which one is faster, Multidimensional OLAP or Relational OLAP?
MOLAP is generally faster than ROLAP, because MOLAP data is pre-aggregated and stored in an
optimized multidimensional cube (often held in memory), so queries read pre-computed results
instead of generating SQL against relational tables.
What are the operations that can be performed by an OLAP cube? Explain each.
- Drill-down: The drill-down operation allows users to navigate from summarized
(high-level) data to more detailed (lower-level) data. It increases the level of
granularity in the cube by moving deeper into the hierarchy.
- Roll-up: The drill-up (or roll-up) operation is the opposite of drill-down. It
summarizes detailed data into higher levels of aggregation by reducing granularity.
- Slice: The slice operation selects a specific subset of data from the OLAP cube by
fixing a value for one of the dimensions. It effectively reduces the dimensionality of
the data.
- Dice: The dice operation selects a sub-cube by choosing specific ranges of values
from multiple dimensions. It's similar to the slice operation but applies to multiple
dimensions simultaneously.
- Pivot (Rotate): The pivot operation changes the dimensional orientation of the data
to provide a different view of it. It involves rotating the cube to see data from
different perspectives by swapping rows and columns.
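These operations can be imitated on a tiny cube modelled as a Python dict keyed by (year, region, product); the data and dimension names are invented for illustration, and pivot is omitted since it only reorients the display.

```python
from collections import defaultdict

# A tiny cube as (year, region, product) -> sales (illustrative data).
cube = {
    (2023, "North", "P1"): 100, (2023, "South", "P1"): 80,
    (2024, "North", "P1"): 120, (2024, "North", "P2"): 60,
}

# Roll-up: aggregate away the product dimension (reduce granularity).
rollup = defaultdict(int)
for (year, region, product), sales in cube.items():
    rollup[(year, region)] += sales

# Slice: fix one dimension (year = 2024), dropping it from the result.
slice_2024 = {(r, p): s for (y, r, p), s in cube.items() if y == 2024}

# Dice: keep a sub-cube over chosen values of several dimensions at once.
dice = {k: s for k, s in cube.items()
        if k[0] in (2023, 2024) and k[1] == "North"}

print(dict(rollup))  # {(2023, 'North'): 100, (2023, 'South'): 80, (2024, 'North'): 180}
print(slice_2024)    # {('North', 'P1'): 120, ('North', 'P2'): 60}
print(len(dice))     # 3
```

Drill-down is the inverse of the roll-up shown: it would go back from the (year, region) totals to the full (year, region, product) detail.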
What is normalization? What is the benefit of Normalisation?
- Normalization is a database design process that organizes data in a way that reduces
redundancy and dependency by dividing large tables into smaller, related tables. This
process helps in establishing relationships between the tables through the use of
foreign keys. The main objective of normalization is to eliminate data anomalies and
ensure data integrity.
- Benefits of Normalization:
1. Reduced Data Redundancy: Normalization minimizes the amount of
duplicated data within the database, leading to efficient storage.
2. Improved Data Integrity: By enforcing rules for data relationships and
dependencies, normalization helps maintain data accuracy and consistency.
3. Efficient Data Retrieval: Normalized databases can result in faster queries, as
the data is structured logically and can be easily accessed through
relationships.
4. Simplified Database Maintenance: With reduced redundancy and clear
relationships, database maintenance tasks (like updates and deletions)
become more manageable.
5. Flexibility for Changes: Normalized structures allow for easier modification
of the database schema. Adding new data or adjusting relationships can be
done without significant changes to the overall design.
6. Enhanced Security: By separating sensitive data into distinct tables and
controlling access to those tables, normalization can enhance data security.
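A small normalization sketch with invented customer/order tables: instead of repeating the customer's city on every order row, the city lives once in a customers table, so an address change is a single-row update with no risk of inconsistency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Normalized design: customers and orders split into two tables,
# linked by a foreign key, so the city is stored exactly once.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT, city TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount REAL);
    INSERT INTO customers VALUES (1, 'Alice', 'Pune');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0);
""")

# A city change is now a single-row update, not an update to every order.
with conn:
    conn.execute("UPDATE customers SET city = 'Mumbai' WHERE customer_id = 1")

row = conn.execute("""
    SELECT c.city, SUM(o.amount)
    FROM orders o JOIN customers c USING (customer_id)
    GROUP BY c.city""").fetchone()
print(row)  # ('Mumbai', 65.0)
```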