OBIEE Data Lineage Solution
Various vendors offer data lineage solutions, but these can be expensive and vendor-
specific. With our simple solution, we combine Catalog Manager and Administration
Tool export sources to create an easily accessible solution for tracking data lineage in
OBIEE.
By implementing the OBIEE data lineage solution, we can check the following:
1. Which physical tables and columns are used in a given report or dashboard
2. Which reports use given physical columns or tables; this is especially important
when modifying an existing table, as any change in the table’s structure must take
existing reports into consideration
3. Which columns are most commonly used in reports in a given subject area;
knowing these columns can point to candidates for indexing, which may further
improve the overall performance of the OBIEE implementation.
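As a rough illustration of the third check, the sketch below counts how many distinct reports in a subject area use each physical column. The row layout, report names, and table names are hypothetical sample data, not the solution's actual schema.

```python
from collections import Counter

# Hypothetical lineage rows: (report, subject_area, physical_table, physical_column).
lineage_rows = [
    ("Sales Overview", "Sales", "W_SALES_F", "AMOUNT"),
    ("Sales Overview", "Sales", "W_DATE_D", "CAL_MONTH"),
    ("Margin by Month", "Sales", "W_SALES_F", "AMOUNT"),
    ("Margin by Month", "Sales", "W_DATE_D", "CAL_MONTH"),
    ("Top Customers", "Sales", "W_SALES_F", "AMOUNT"),
    ("Headcount", "HR", "W_EMP_D", "EMP_ID"),
]


def most_used_columns(rows, subject_area, top_n=3):
    """Count how many distinct reports in a subject area use each physical column."""
    # De-duplicate first so a column used twice in one report counts once.
    seen = {
        (report, table, column)
        for report, area, table, column in rows
        if area == subject_area
    }
    counts = Counter((table, column) for _report, table, column in seen)
    return counts.most_common(top_n)


top_sales = most_used_columns(lineage_rows, "Sales")
print(top_sales)
```

Columns at the top of such a ranking are the natural first candidates to review for indexing.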
Solution
The ClearPeaks OBIEE Data Lineage Solution gathers all the required data lineage
information in one place and uses it as a source for OBIEE reports within the Data
Lineage Subject Area. Two sources are combined to achieve this: the Catalog Manager
export and the Administration Tool export.
Catalog Manager provides an option to export the list of reports and the columns they
use. The export can be done either with the Catalog Manager UI tool or through the
command line utility. We use the latter option, as it allows the whole process to be
automated later.
Once we have obtained both files, we need to populate the data lineage tables.
The data can be transformed and manually inserted into the tables, but in our solution
we use a script (which can run on the OBIEE server) that parses the data and inserts it
into the tables.
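A minimal sketch of such a parse-and-load script is shown below. It assumes both exports are comma-separated files with a header row; the column names, the table names (dl_report_columns, dl_column_mappings), and the use of SQLite as a stand-in for the target database are all assumptions made for illustration, not the actual layout used in the solution.

```python
import csv
import io
import sqlite3

# Hypothetical export contents; real layouts depend on the export options used.
CATALOG_EXPORT = """report_path,subject_area,presentation_table,presentation_column
/shared/Sales/Sales Overview,Sales,Facts,Revenue
/shared/Sales/Sales Overview,Sales,Time,Month
"""
REPOSITORY_EXPORT = """subject_area,presentation_table,presentation_column,physical_table,physical_column
Sales,Facts,Revenue,W_SALES_F,AMOUNT
Sales,Time,Month,W_DATE_D,CAL_MONTH
"""


def load_lineage(conn, catalog_file, repository_file):
    """Parse both exports and insert their rows into two lineage tables."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS dl_report_columns (
               report_path TEXT, subject_area TEXT,
               presentation_table TEXT, presentation_column TEXT)"""
    )
    conn.execute(
        """CREATE TABLE IF NOT EXISTS dl_column_mappings (
               subject_area TEXT, presentation_table TEXT,
               presentation_column TEXT,
               physical_table TEXT, physical_column TEXT)"""
    )
    # csv.DictReader yields one dict per row, keyed by the header names,
    # which matches the named placeholders below.
    conn.executemany(
        "INSERT INTO dl_report_columns VALUES (:report_path, :subject_area, "
        ":presentation_table, :presentation_column)",
        csv.DictReader(catalog_file),
    )
    conn.executemany(
        "INSERT INTO dl_column_mappings VALUES (:subject_area, "
        ":presentation_table, :presentation_column, "
        ":physical_table, :physical_column)",
        csv.DictReader(repository_file),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_lineage(conn, io.StringIO(CATALOG_EXPORT), io.StringIO(REPOSITORY_EXPORT))
loaded = conn.execute("SELECT COUNT(*) FROM dl_report_columns").fetchone()[0]
print(loaded)
```

In a scheduled job, the two io.StringIO objects would be replaced by the opened export files, and the in-memory connection by a connection to the database that holds the lineage tables.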
Once we have populated the data lineage tables and mapped their data model in the
Administration Tool, we can create and run reports in OBIEE using the Data Lineage
Subject Area and filter the results according to our requirements.
Let us look at a few of the use cases for the Data Lineage Subject Area:
Use case 1. Which data warehouse tables and columns are used in a given report?
We would like to know which data warehouse tables and columns are used in a
particular report or dashboard. We can create a report that lists the columns used
by a given OBIEE report together with their data lineage.
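The lookup behind such a report is essentially a join between the report columns and the repository mappings. The sketch below shows that join under assumed names: the two tables, their columns, and the sample rows are hypothetical, and SQLite stands in for the database holding the lineage tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified lineage tables with sample rows.
conn.executescript(
    """
    CREATE TABLE dl_report_columns (report TEXT, presentation_column TEXT);
    CREATE TABLE dl_column_mappings (presentation_column TEXT,
                                     physical_table TEXT,
                                     physical_column TEXT);
    INSERT INTO dl_report_columns VALUES
        ('Sales Overview', 'Revenue'),
        ('Sales Overview', 'Month'),
        ('Headcount', 'Employee');
    INSERT INTO dl_column_mappings VALUES
        ('Revenue', 'W_SALES_F', 'AMOUNT'),
        ('Month', 'W_DATE_D', 'CAL_MONTH'),
        ('Employee', 'W_EMP_D', 'EMP_ID');
    """
)


def lineage_for_report(conn, report):
    """Return (presentation column, physical table, physical column) per report column."""
    return conn.execute(
        """SELECT r.presentation_column, m.physical_table, m.physical_column
           FROM dl_report_columns r
           JOIN dl_column_mappings m
             ON m.presentation_column = r.presentation_column
           WHERE r.report = ?
           ORDER BY r.presentation_column""",
        (report,),
    ).fetchall()


rows = lineage_for_report(conn, "Sales Overview")
for row in rows:
    print(row)
```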
Use case 2. Which reports use a given physical table or column?
We want to know how many reports or dashboards use a given physical table or
column, and which ones; this can be very useful when assessing the potential impact
of changes to a column formula or table structure on reporting. Using the Data Lineage
Subject Area, we can fetch the list of OBIEE reports that use a given physical table.
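This impact analysis is the reverse of the previous lookup: start from a physical table, optionally narrow it to one column, and collect the reports that touch it. A small sketch follows, using hypothetical lineage rows and names.

```python
# Hypothetical lineage rows: (report, physical_table, physical_column).
lineage_rows = [
    ("Sales Overview", "W_SALES_F", "AMOUNT"),
    ("Sales Overview", "W_DATE_D", "CAL_MONTH"),
    ("Margin by Month", "W_SALES_F", "COST"),
    ("Headcount", "W_EMP_D", "EMP_ID"),
]


def impacted_reports(rows, table, column=None):
    """List reports that use a physical table, optionally limited to one column."""
    return sorted(
        {
            report
            for report, row_table, row_column in rows
            if row_table == table and (column is None or row_column == column)
        }
    )


table_impact = impacted_reports(lineage_rows, "W_SALES_F")
column_impact = impacted_reports(lineage_rows, "W_SALES_F", "COST")
print(table_impact)
print(column_impact)
```

An empty result for a table or column suggests that, as far as the captured lineage goes, changing it should not affect any existing report.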
Use case 3. Which reports use a given subject area?
We need to know which reports and dashboards access data from a given subject
area. This may be particularly useful when revising users' access permissions.
Future Improvements
OBIEE Data Lineage Subject Area can also serve as a backbone for further
developments, providing complex and complete solutions for managing existing OBIEE
project implementations. Here are some examples and potential benefits of merging
additional information into the solution:
Usage Tracking data – allows analysis of the physical tables used in the most
accessed reports and of the tables and columns in reports not used by anyone, which
helps when removing unused reports or their underlying tables.
Data warehouse database metadata – such as table size, indexes on columns. This
allows for performance analysis of the most heavily used tables and columns by report
usage.
ETL Data Lineage – an additional layer of data lineage tracking that follows data
back to the very source. It can be achieved by adding ETL transformation data
obtained from ETL systems. For example, it is possible to track all the ETL
transformations on a given presentation column down to the source system.
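To make the Usage Tracking idea above more concrete, the sketch below combines lineage rows with per-report execution counts to find reports nobody runs and the tables that only those reports depend on. The report names, table names, and counts are hypothetical sample data; how the counts are actually obtained from Usage Tracking is outside this sketch.

```python
# Hypothetical lineage rows: (report, physical_table).
lineage = [
    ("Sales Overview", "W_SALES_F"),
    ("Sales Overview", "W_DATE_D"),
    ("Old Forecast", "W_FORECAST_F"),
    ("Old Forecast", "W_DATE_D"),
]
# Hypothetical execution counts per report; a missing report counts as never run.
run_counts = {"Sales Overview": 120}


def unused_reports(lineage, run_counts):
    """Reports present in the lineage data that have no recorded executions."""
    return sorted(
        {report for report, _table in lineage if run_counts.get(report, 0) == 0}
    )


def tables_only_in_unused_reports(lineage, run_counts):
    """Physical tables referenced exclusively by reports that are never run."""
    all_tables = {table for _report, table in lineage}
    used_tables = {
        table for report, table in lineage if run_counts.get(report, 0) > 0
    }
    return sorted(all_tables - used_tables)


stale_reports = unused_reports(lineage, run_counts)
stale_tables = tables_only_in_unused_reports(lineage, run_counts)
print(stale_reports)
print(stale_tables)
```

Tables flagged this way are only candidates for review, since they may still be used outside OBIEE.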
Adding all the above components creates a strong core for an OBIEE management
dashboard that supports continuous improvement.
All of the above is accessible from the OBIEE front-end, providing a convenient and
quick way to facilitate many daily business-as-usual tasks in BI deployments.
Conclusion
The ClearPeaks OBIEE Data Lineage Solution can be easily deployed in any project
using Oracle Business Intelligence Enterprise Edition. The solution can be run from the
command line tools, which makes it possible to create automated jobs to extract and
update data on a regular basis.
If you think your OBIEE project could benefit from our data lineage solution, please
contact us for more information via our web contact form or by leaving your comments
below!