Exercise on data warehousing
Data Management
A.Y. 2022/23
Maurizio Lenzerini
Conceptual Schema of the operational data
Logical schema of the operational data
Person(SSN)
foreign key Person[SSN] ⊆ HasIncome[person]
HasIncome(person,date,qty,incomeClass)
foreign key HasIncome[person] ⊆ Person[SSN]
HasChild(parent,child)
foreign key HasChild[parent] ⊆ Person[SSN]
foreign key HasChild[child] ⊆ Person[SSN]
Loan(borrower,branchcode,bankcode,startdate,enddate,amount,rate,status,category,purpose)
foreign key Loan[borrower] ⊆ Person[SSN]
foreign key Loan[branchcode,bankcode] ⊆ Branch[code,bank]
Branch(code,bank,city)
foreign key Branch[bank] ⊆ Bank[code]
Bank(code,name)
HasDiscount(borrower,branchcode,bankcode,startdate,dcode)
foreign key HasDiscount[borrower,branchcode,bankcode,startdate] ⊆ Loan[borrower,branchcode,bankcode,startdate]
foreign key HasDiscount[dcode] ⊆ DiscountDecision[code]
DiscountDecision(code,date,type,amount)
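The logical schema above can be tried out in a small in-memory database. The following is a minimal sketch using Python's sqlite3 module, not the course's reference implementation. The column types and all primary keys are assumptions, since the schema above does not state them, and the helper names make_operational_db and accepts are hypothetical. The inclusion Person[SSN] ⊆ HasIncome[person] is left unenforced because it cannot be written as a plain SQL foreign key in this sketch.

```python
import sqlite3

# Hypothetical SQLite DDL; column types and primary keys are assumptions.
OPERATIONAL_DDL = """
CREATE TABLE Person(SSN TEXT PRIMARY KEY);
CREATE TABLE HasIncome(person TEXT REFERENCES Person(SSN), date TEXT, qty REAL,
                       incomeClass TEXT, PRIMARY KEY (person, date));
CREATE TABLE HasChild(parent TEXT REFERENCES Person(SSN),
                      child TEXT REFERENCES Person(SSN),
                      PRIMARY KEY (parent, child));
CREATE TABLE Bank(code TEXT PRIMARY KEY, name TEXT);
CREATE TABLE Branch(code TEXT, bank TEXT REFERENCES Bank(code), city TEXT,
                    PRIMARY KEY (code, bank));
CREATE TABLE Loan(borrower TEXT REFERENCES Person(SSN), branchcode TEXT, bankcode TEXT,
                  startdate TEXT, enddate TEXT, amount REAL, rate REAL, status TEXT,
                  category TEXT, purpose TEXT,
                  PRIMARY KEY (borrower, branchcode, bankcode, startdate),
                  FOREIGN KEY (branchcode, bankcode) REFERENCES Branch(code, bank));
CREATE TABLE DiscountDecision(code TEXT PRIMARY KEY, date TEXT, type TEXT, amount REAL);
CREATE TABLE HasDiscount(borrower TEXT, branchcode TEXT, bankcode TEXT, startdate TEXT,
                         dcode TEXT REFERENCES DiscountDecision(code),
                         FOREIGN KEY (borrower, branchcode, bankcode, startdate)
                           REFERENCES Loan(borrower, branchcode, bankcode, startdate));
"""


def make_operational_db():
    """Create an empty in-memory operational database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys by default
    conn.executescript(OPERATIONAL_DDL)
    return conn


def accepts(conn, sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this sketch, a loan whose borrower does not appear in Person is rejected, while a loan that refers to an existing person and branch is accepted.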
Requirements for data warehousing
For every loan we are interested in the starting date, the end date, the borrower (with SSN,
number of children, income at the starting date, class of income), the branch of the bank where
the loan was issued (with name and city of the branch), the category (e.g., fixed rate or one of
different types of variable rate), the purpose of the loan (e.g., to buy a car, a house, a personal
loan, ...), the discount type (if it applies) and the status (fully repaid, defaulted, ...). Also, for
every loan the following values are of interest: the amount, the interest rate and the amount of
discount (if it applies).
Typical questions for starting OLAP sessions on the data warehouse by business analysts:
• Give the average interest rate, per loan category and branch in the various months of the start
date.
• For all branches, give the minimum, maximum interest rate per loan purpose and per class of
income of the borrower.
• Give the number of loans and the average duration (in years) per branch and per number of
children of the borrower.
• Give the percentage of defaulted loans per month of the end year and per city of the branch
where the loan was issued.
What to do in the exercise

1. Produce the DFM schema of the data warehouse
2. Derive the Star schema corresponding to the DFM schema
3. Design the ETL processes that load the tables in the Star schema
4. Write the queries for capturing the above-mentioned typical questions for the OLAP sessions
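DFM schema

Fact: Loan
Measures: amount, interestRate, discount*
Dimensions:
• borrower (attributes: SSN, numChildren, income, incomeClass)
• branch (attribute: branchCity)
• start (hierarchy: date → month → year)
• end (hierarchy: date → month → year)
• status
• category
• purpose
• discountType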

Note:
There may exist some fact with no value associated to “discount” (see *).

Integrity constraints:
• “borrower”, “branch” and “start” form an identifier for “Loan”
• For every fact F of type Loan, the fact F has a value for “discount” if and only if it has a value for the dimension
“discountType”.
Star schema

Fact table:
Loan(borrower, branch, start, end, status, category, purpose, discountType, amount, interestRate, discount*)

Dimension tables:
Borrower(keyBorrower, SSN, numChildren, income, incomeClass)
Branch(keyBranch, branchcode, bankcode, branchCity)
Date(keyDate, date, month, year)
Status(keyStatus, status)
Category(keyCategory, category)
Purpose(keyProperties, purpose)
DiscountType(keyDiscountType, DiscountType)

Integrity constraints:
• “borrower”, “branch” and “start” form a key for “Loan”
• For every tuple t of Loan, t.discount is NULL if and only if t.discountType is NULL
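The two integrity constraints of the star schema can be stated declaratively. The sketch below is one possible SQLite rendering through Python's sqlite3 module, under several assumptions: the column types are guesses, the key constraint is expressed as a PRIMARY KEY, the "if and only if" constraint as a CHECK, and the helper names make_star_db and try_insert_loan are hypothetical. The dimension rows are toy data invented for illustration.

```python
import sqlite3

# Hypothetical SQLite DDL for the star schema; all column types are assumptions.
STAR_DDL = """
CREATE TABLE Borrower(keyBorrower INTEGER PRIMARY KEY, SSN TEXT, numChildren INTEGER,
                      income REAL, incomeClass TEXT);
CREATE TABLE Branch(keyBranch INTEGER PRIMARY KEY, branchcode TEXT, bankcode TEXT,
                    branchCity TEXT);
CREATE TABLE Date(keyDate INTEGER PRIMARY KEY, date TEXT, month INTEGER, year INTEGER);
CREATE TABLE Status(keyStatus INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE Category(keyCategory INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE Purpose(keyProperties INTEGER PRIMARY KEY, purpose TEXT);
CREATE TABLE DiscountType(keyDiscountType INTEGER PRIMARY KEY, DiscountType TEXT);
CREATE TABLE Loan(
    borrower     INTEGER NOT NULL REFERENCES Borrower(keyBorrower),
    branch       INTEGER NOT NULL REFERENCES Branch(keyBranch),
    "start"      INTEGER NOT NULL REFERENCES Date(keyDate),
    "end"        INTEGER REFERENCES Date(keyDate),
    status       INTEGER REFERENCES Status(keyStatus),
    category     INTEGER REFERENCES Category(keyCategory),
    purpose      INTEGER REFERENCES Purpose(keyProperties),
    discountType INTEGER REFERENCES DiscountType(keyDiscountType),
    amount       REAL,
    interestRate REAL,
    discount     REAL,
    -- "borrower", "branch" and "start" form a key for Loan
    PRIMARY KEY (borrower, branch, "start"),
    -- discount is NULL if and only if discountType is NULL
    CHECK ((discount IS NULL) = (discountType IS NULL))
);
"""


def make_star_db():
    """Create an in-memory star schema with a few toy dimension rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys by default
    conn.executescript(STAR_DDL)
    conn.executescript("""
        INSERT INTO Borrower VALUES (1, 'P1', 2, 30000, 'B');
        INSERT INTO Branch VALUES (1, 'BR1', 'BK1', 'Rome');
        INSERT INTO Date VALUES (1, '2020-01-15', 1, 2020), (2, '2022-01-15', 1, 2022);
        INSERT INTO Status VALUES (1, 'fully repaid');
        INSERT INTO Category VALUES (1, 'fixed');
        INSERT INTO Purpose VALUES (1, 'car');
        INSERT INTO DiscountType VALUES (1, 'promotional');
    """)
    return conn


def try_insert_loan(conn, row):
    """Return True if the Loan row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Loan VALUES (?,?,?,?,?,?,?,?,?,?,?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch a row with a discount but no discountType (or the other way round) is rejected by the CHECK clause, and a second row with the same borrower, branch and start is rejected by the key.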
Example of ETL query
We consider the ETL process (based on a query, in this case) that loads the data into the dimension table “Borrower”. Here is the SQL code for this process:

-- keyBorrower is a surrogate key, generated here with row_number()
insert into Borrower(keyBorrower,SSN,income,incomeClass,numChildren)
select row_number() over (order by p.SSN, n.Qty, n.IncomeClass) as keyBorrower,
       p.SSN, n.Qty as income, n.IncomeClass,
       -- distinct: identical income records on different dates would repeat each child
       count(distinct c.child) as numChildren
from Person p join HasIncome n on p.SSN = n.person
     left join HasChild c on p.SSN = c.parent
group by p.SSN, n.Qty, n.IncomeClass
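To see what this ETL step produces, the sketch below runs a version of the query on toy data in an in-memory SQLite database. It rests on several assumptions: the column types are guesses, the data is invented, the function name load_borrower_dimension is hypothetical, and the surrogate key is assigned by SQLite (an INTEGER PRIMARY KEY column left out of the insert) rather than by an explicit generator. The sketch counts distinct children, so that a person with two identical income records on different dates is not credited with each child twice.

```python
import sqlite3


def load_borrower_dimension():
    """Load toy operational data, run the ETL query, return the Borrower rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Operational tables (column types are assumptions)
        CREATE TABLE Person(SSN TEXT PRIMARY KEY);
        CREATE TABLE HasIncome(person TEXT, date TEXT, qty REAL, incomeClass TEXT);
        CREATE TABLE HasChild(parent TEXT, child TEXT);
        -- Dimension table: SQLite assigns keyBorrower when the column is omitted
        CREATE TABLE Borrower(keyBorrower INTEGER PRIMARY KEY, SSN TEXT, income REAL,
                              incomeClass TEXT, numChildren INTEGER);
        -- Toy data, invented for illustration
        INSERT INTO Person VALUES ('P1'), ('P2'), ('C1'), ('C2');
        INSERT INTO HasIncome VALUES ('P1', '2020-01-01', 30000, 'B'),
                                     ('P1', '2021-01-01', 30000, 'B'),
                                     ('P2', '2020-01-01', 50000, 'C'),
                                     ('C1', '2020-01-01', 0, 'A'),
                                     ('C2', '2020-01-01', 0, 'A');
        INSERT INTO HasChild VALUES ('P1', 'C1'), ('P1', 'C2');
    """)
    conn.execute("""
        INSERT INTO Borrower(SSN, income, incomeClass, numChildren)
        SELECT p.SSN, n.qty, n.incomeClass, count(DISTINCT c.child)
        FROM Person p JOIN HasIncome n ON p.SSN = n.person
             LEFT JOIN HasChild c ON p.SSN = c.parent
        GROUP BY p.SSN, n.qty, n.incomeClass
    """)
    return conn.execute(
        "SELECT keyBorrower, SSN, income, incomeClass, numChildren "
        "FROM Borrower ORDER BY SSN"
    ).fetchall()
```

On this toy data, P1 ends up with a single Borrower row and numChildren equal to 2, while P2 and the two children get numChildren equal to 0 thanks to the left join.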
Example of OLAP query

We consider the first query: Give the average interest rate, per loan category and branch in the various months of the start date.
Here is the SQL code for this query:

select d.month, d.year, c.category, b.branchcode, b.bankcode, avg(a.interestRate)
from Loan a, Category c, Branch b, Date d
where a.category = c.keyCategory and a.branch = b.keyBranch and
      a.start = d.keyDate
group by d.month, d.year, c.category, b.branchcode, b.bankcode
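The four typical questions can be tried on toy data. The sketch below is a possible formulation, not the official solution. It assumes the star schema tables Borrower, Branch, Date, Status, Category and Purpose with guessed column types, invented data, the literal status value 'defaulted', a duration measured in days divided by 365.25, and "month of the end year" read as month and year of the end date. The names QUERIES and run_olap_queries are hypothetical.

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE Borrower(keyBorrower INTEGER PRIMARY KEY, SSN TEXT, numChildren INTEGER,
                      income REAL, incomeClass TEXT);
CREATE TABLE Branch(keyBranch INTEGER PRIMARY KEY, branchcode TEXT, bankcode TEXT,
                    branchCity TEXT);
CREATE TABLE Date(keyDate INTEGER PRIMARY KEY, date TEXT, month INTEGER, year INTEGER);
CREATE TABLE Status(keyStatus INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE Category(keyCategory INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE Purpose(keyProperties INTEGER PRIMARY KEY, purpose TEXT);
CREATE TABLE Loan(borrower INTEGER, branch INTEGER, "start" INTEGER, "end" INTEGER,
                  status INTEGER, category INTEGER, purpose INTEGER, discountType INTEGER,
                  amount REAL, interestRate REAL, discount REAL);
-- Toy data, invented for illustration only
INSERT INTO Borrower VALUES (1,'P1',2,30000,'B'), (2,'P2',0,50000,'C');
INSERT INTO Branch VALUES (1,'BR1','BK1','Rome'), (2,'BR2','BK1','Milan');
INSERT INTO Date VALUES (1,'2020-01-15',1,2020), (2,'2020-01-20',1,2020),
                        (3,'2022-01-15',1,2022), (4,'2024-01-20',1,2024);
INSERT INTO Status VALUES (1,'fully repaid'), (2,'defaulted');
INSERT INTO Category VALUES (1,'fixed'), (2,'variable');
INSERT INTO Purpose VALUES (1,'car'), (2,'house');
INSERT INTO Loan VALUES (1,1,1,3,1,1,1,NULL,10000,3.0,NULL),
                        (2,1,2,3,2,1,2,NULL,20000,5.0,NULL),
                        (2,2,1,4,1,2,2,NULL,15000,4.0,NULL);
"""

QUERIES = {
    # Q1: average interest rate per category, branch and month of the start date
    "q1": """select d.month, d.year, c.category, b.branchcode, b.bankcode,
                    avg(a.interestRate)
             from Loan a, Category c, Branch b, Date d
             where a.category = c.keyCategory and a.branch = b.keyBranch
                   and a."start" = d.keyDate
             group by d.month, d.year, c.category, b.branchcode, b.bankcode
             order by c.category""",
    # Q2: minimum and maximum interest rate per branch, purpose and income class
    "q2": """select b.branchcode, b.bankcode, p.purpose, w.incomeClass,
                    min(a.interestRate), max(a.interestRate)
             from Loan a, Branch b, Purpose p, Borrower w
             where a.branch = b.keyBranch and a.purpose = p.keyProperties
                   and a.borrower = w.keyBorrower
             group by b.branchcode, b.bankcode, p.purpose, w.incomeClass
             order by b.branchcode, p.purpose""",
    # Q3: number of loans and average duration in years per branch and numChildren
    # (loans without an end date drop out of the join on "end")
    "q3": """select b.branchcode, b.bankcode, w.numChildren, count(*),
                    avg((julianday(e.date) - julianday(s.date)) / 365.25)
             from Loan a, Branch b, Borrower w, Date s, Date e
             where a.branch = b.keyBranch and a.borrower = w.keyBorrower
                   and a."start" = s.keyDate and a."end" = e.keyDate
             group by b.branchcode, b.bankcode, w.numChildren
             order by b.branchcode, w.numChildren""",
    # Q4: percentage of defaulted loans per month of the end date and branch city
    "q4": """select e.month, e.year, b.branchCity,
                    100.0 * sum(case when s.status = 'defaulted' then 1 else 0 end)
                          / count(*)
             from Loan a, Date e, Branch b, Status s
             where a."end" = e.keyDate and a.branch = b.keyBranch
                   and a.status = s.keyStatus
             group by e.month, e.year, b.branchCity
             order by e.year""",
}


def run_olap_queries():
    """Build the toy data warehouse and return the rows of each query by name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return {name: conn.execute(sql).fetchall() for name, sql in QUERIES.items()}
```

Calling run_olap_queries() returns a dictionary with one list of result rows per question. Note that the third question joins the Date table twice, once for the start date and once for the end date.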
