Star Query Versus Star Transformation Query: Which To Choose?
Michael Janesch
Innovative Consulting
Abstract
Star schema design is the backbone of the data warehouse architecture. The method used to query the star schema has
evolved over successive releases of Oracle with enhancements to the Cost-Based Optimizer. Oracle7 introduced the
“Star Query” execution path, and Oracle8 added the option of the “Star Transformation Query” execution
path. This paper tracks the evolution of querying the star schema and describes how both paths work, so that you can
identify and implement the better option for your data warehousing queries.
Star Schema:
The Star Schema is the basic design of the data warehouse. It is made up of a Fact table and several Dimension
tables. Multiple star schemas can exist in a data warehouse where fact tables may share the same dimensions.
Fact Table:
The Fact Table contains the quantitative information that defines what users will ultimately want to analyze. Its
key components are made up of foreign keys to the dimension tables. Non-key components are the actual numeric
facts that will be reported, summarized, and analyzed. Fact tables are narrow in record width due to their numeric
nature and large in number of rows. These tables typically hold tens or hundreds of millions of rows. Examples
of facts would be sales and shipment tables.
Dimension Table:
Dimension tables contain the qualitative information that defines how users will analyze the fact information. The
key component is the single column that identifies the record as unique. The non-key components contain the
descriptive information about this record. The important thing to remember about dimension tables is that they are
denormalized, which increases the performance when querying in a star schema. Dimension tables are wide in
record width due to their descriptive character nature and small in number of rows as compared to the Facts to
which they relate. One dimension table that is always found in a data warehouse is Time.
Snowflake Schema:
A Star becomes a Snowflake when the dimension tables are no longer completely denormalized and have foreign
keys to other dimensions. This design may be necessary to save physical space or to fulfill requirements, but can
lead to performance issues and tuning challenges.
Non-Traditional Example
These methods are best demonstrated through examples of tables, queries, and execution paths. Instead of using
the traditional SALES, STORE, PRODUCT, and TIME star schema model, these examples will be explained
through the simple example of the FACT and 5 dimension (DIM#) tables. In the “real world”, these tables would
have more columns and appropriate storage clauses.
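CREATE TABLE dim1( dim1_key NUMBER, dim1_attr VARCHAR2(100), CONSTRAINT dim1_pk PRIMARY KEY(dim1_key));
CREATE TABLE dim2( dim2_key NUMBER, dim2_attr VARCHAR2(100), CONSTRAINT dim2_pk PRIMARY KEY(dim2_key));
CREATE TABLE dim3( dim3_key NUMBER, dim3_attr VARCHAR2(100), CONSTRAINT dim3_pk PRIMARY KEY(dim3_key));
CREATE TABLE dim4( dim4_key NUMBER, dim4_attr VARCHAR2(100), CONSTRAINT dim4_pk PRIMARY KEY(dim4_key));
-- The dim5 definition and the tail of the fact definition are missing from the source. They are assumed
-- here from the dim1-dim4 pattern, the fact1 column used in the sample query, and the primary key text.
CREATE TABLE dim5( dim5_key NUMBER, dim5_attr VARCHAR2(100), CONSTRAINT dim5_pk PRIMARY KEY(dim5_key));
CREATE TABLE fact( dim1_key NUMBER, dim2_key NUMBER, dim3_key NUMBER, dim4_key NUMBER, dim5_key NUMBER,
  fact1 NUMBER, CONSTRAINT fact_pk PRIMARY KEY(dim1_key, dim2_key, dim3_key, dim4_key, dim5_key));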
The tables are populated with variations of the alphabet (e.g. INSERT INTO dim1 VALUES(1,'a'); INSERT
INTO dim1 VALUES(2,'b'); … ) and have the following row counts:
DIM1( 26 ), DIM2( 30000 ), DIM3( 100000 ), DIM4( 26 ), DIM5( 26 ), and FACT( 2.7 million ).
Notice that DIM2 and DIM3 are the larger dimension tables for the examples.
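The layout above can be tried out in miniature. The following is only a sketch: it uses Python's built-in sqlite3 module rather than Oracle, loads 26 rows into every dimension and 1,000 rows into FACT instead of the row counts listed above, and the function name build_star_schema and the way the foreign keys are generated are assumptions made for illustration.

```python
import sqlite3
import string


def build_star_schema(conn, fact_rows=1000):
    """Create the five DIM tables and FACT in SQLite and load toy data."""
    cur = conn.cursor()
    for n in range(1, 6):
        cur.execute(
            f"CREATE TABLE dim{n} (dim{n}_key INTEGER PRIMARY KEY, dim{n}_attr TEXT)"
        )
        # 26 rows per dimension: key 1 -> 'a', key 2 -> 'b', and so on.
        cur.executemany(
            f"INSERT INTO dim{n} VALUES (?, ?)",
            [(i + 1, ch) for i, ch in enumerate(string.ascii_lowercase)],
        )
    cur.execute(
        "CREATE TABLE fact (dim1_key INTEGER, dim2_key INTEGER, dim3_key INTEGER,"
        " dim4_key INTEGER, dim5_key INTEGER, fact1 INTEGER)"
    )
    # Deterministic foreign keys (an assumption) so the example is repeatable.
    rows = [
        (i % 26 + 1, i % 13 + 1, i % 7 + 1, i % 5 + 1, i % 3 + 1, i)
        for i in range(fact_rows)
    ]
    cur.executemany("INSERT INTO fact VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ["dim1", "dim2", "dim3", "dim4", "dim5", "fact"]
}
print(counts)
```

The shape is the point here: narrow numeric rows in FACT, each pointing at one row in every dimension. None of the Oracle optimizer behaviour discussed in this paper is reproduced by SQLite.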
• Create a composite B-tree index on the dimension foreign keys in the FACT table. This index does not
necessarily have to be the primary key, and multiple composite indexes may be required depending on the queries.
In this example, the composite unique index was created with the primary key constraint. The query will
only use a composite index whose leading columns appear in the WHERE clause without modification
(i.e. with no functions applied to the columns).
CREATE UNIQUE INDEX fact_pk ON fact (dim1_key, dim2_key, dim3_key, dim4_key, dim5_key);
• Use the STAR hint in the query if necessary to force the execution path. The CBO should identify a star,
but sometimes the data distributions or lack of quality statistics could require a hint to help the optimizer.
SELECT /*+ STAR */
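The leading-column requirement for the composite index can be illustrated outside the database. The sketch below is an assumption-laden model, not Oracle's B-tree: the index is a sorted Python list of key tuples, and the helper names probe_prefix and scan_non_leading are hypothetical. It shows that a sorted composite structure can be range-probed by its leading columns, while a condition on a trailing column alone has to examine every entry.

```python
from bisect import bisect_left, bisect_right

# Composite "index": key tuples sorted by (dim1_key, dim2_key, dim3_key).
index = sorted(
    (d1, d2, d3)
    for d1 in range(1, 4)
    for d2 in range(1, 4)
    for d3 in range(1, 4)
)


def probe_prefix(index, prefix):
    """Range-scan the entries whose leading columns equal `prefix`."""
    lo = bisect_left(index, prefix)
    # Appending +infinity places the upper bound after every entry with this prefix.
    hi = bisect_right(index, prefix + (float("inf"),))
    return index[lo:hi]


def scan_non_leading(index, dim3_key):
    """Without the leading columns, every entry has to be examined."""
    return [entry for entry in index if entry[2] == dim3_key]


by_prefix = probe_prefix(index, (2, 1))   # binary search, then a short range
by_scan = scan_non_leading(index, 3)      # full pass over all 27 entries
print(by_prefix)
print(len(by_scan))
```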
Explain Plan
In SQL*Plus, use SET AUTOTRACE ON with your PLAN_TABLE accessible to produce an Explain Plan. This
plan will show you whether you are actually executing a Star Query. Reading this plan (inside-out, top to bottom), you
can see that the CBO did execute a Star Query plan by doing Cartesian joins of the DIM tables and accessing the
FACT table last through the composite index.
The plan starts with the index read on the 100,000-row DIM3 table (A) and performs a Cartesian join (C) after
retrieving from the 30,000-row DIM2 table; it then does the second Cartesian join (D) with the full table scan of the
26-row DIM1 table (B). The result of this Cartesian join then becomes the driver of the NESTED LOOPS operation to
retrieve from the composite index on the FACT table (E).
SELECT STATEMENT Optimizer=CHOOSE
NESTED LOOPS
INLIST ITERATOR
SORT (JOIN)
INLIST ITERATOR
SORT (JOIN)
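The Star Query strategy described above (filter the dimensions, form the Cartesian product of the surviving keys, then probe the composite index on FACT once per combination) can be sketched as follows. This is a simplified model under stated assumptions: only three dimensions are used, the data is made up, a Python dict stands in for the composite unique index, and the names keys_where and star_query are hypothetical.

```python
from itertools import product

# Toy dimensions: key -> attribute (made-up data).
dim1 = {1: "a", 2: "b", 3: "c"}
dim2 = {1: "a", 2: "b", 3: "c"}
dim3 = {1: "a", 2: "b", 3: "c"}

# Stand-in for the composite unique index: (dim1_key, dim2_key, dim3_key) -> fact1.
fact_index = {
    (1, 1, 1): 10,
    (1, 2, 3): 20,
    (2, 2, 3): 30,
    (3, 1, 2): 40,
}


def keys_where(dim, wanted):
    """Return the dimension keys whose attribute is in `wanted`."""
    return [key for key, attr in dim.items() if attr in wanted]


def star_query(d1_attrs, d2_attrs, d3_attrs):
    # Step 1: filter each dimension independently.
    k1 = keys_where(dim1, d1_attrs)
    k2 = keys_where(dim2, d2_attrs)
    k3 = keys_where(dim3, d3_attrs)
    # Step 2: Cartesian product of the surviving dimension keys.
    # Step 3: nested loop with one index probe per key combination.
    results = []
    probes = 0
    for combo in product(k1, k2, k3):
        probes += 1
        if combo in fact_index:
            results.append((combo, fact_index[combo]))
    return results, probes


rows, probes = star_query({"a", "b"}, {"b"}, {"c"})
print(rows, probes)
```

In this model the number of probes equals the product of the filtered dimension row counts, so the approach is only cheap while each dimension filter leaves few rows.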
• As with the Star Query Execution, analyze the tables and indexes for the Cost-Based Optimizer.
ANALYZE TABLE dim# COMPUTE STATISTICS;
• Create single column bitmap indexes on all of the foreign key columns in the fact table.
CREATE BITMAP INDEX fact_dim1_bm ON fact(dim1_key);
• Use the STAR_TRANSFORMATION hint in the query if necessary to force the execution path.
SELECT /*+ STAR_TRANSFORMATION */
Sample Query
The following query is a good candidate for the Star Transformation query execution because it shows an example
of querying the FACT table outside the ordered composite index by skipping the DIM1 join. A better example
might be joining the FACT to 10-15 DIM tables, but that would produce an explain plan that would fall off the
pages of this paper.
SELECT dim2.dim2_attr, dim3.dim3_attr, dim5.dim5_attr, fact.fact1
SORT (JOIN)
BITMAP AND
BITMAP MERGE
BITMAP MERGE
BITMAP MERGE
AND fact.dim5_key IN (SELECT dim5.dim5_key FROM dim5 WHERE dim5.dim5_attr IN ('l','m'))
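The bitmap steps of the transformed query can be modelled with plain integers used as bit sets. The sketch below is an illustration under assumptions, not Oracle's implementation: the data is made up, each bitmap index is a dict from key value to an int whose set bits are row positions, and the names build_bitmap_index, bitmap_merge, and star_transformation are hypothetical. Each dimension subquery yields a set of keys, the bitmaps for those keys are ORed together (BITMAP MERGE), the per-dimension results are ANDed (BITMAP AND), and only the surviving fact rows are joined back to the dimensions.

```python
# FACT rows as (dim2_key, dim3_key, dim5_key, fact1); list position is the row id.
fact = [
    (1, 1, 1, 10),
    (2, 1, 2, 20),
    (2, 2, 2, 30),
    (3, 2, 1, 40),
    (2, 2, 3, 50),
]
dim2 = {1: "a", 2: "b", 3: "c"}
dim3 = {1: "a", 2: "b"}
dim5 = {1: "l", 2: "m", 3: "n"}


def build_bitmap_index(column):
    """Map each key value to an int whose set bits are the matching row ids."""
    bitmaps = {}
    for rowid, row in enumerate(fact):
        bitmaps[row[column]] = bitmaps.get(row[column], 0) | (1 << rowid)
    return bitmaps


bm_dim2 = build_bitmap_index(0)
bm_dim3 = build_bitmap_index(1)
bm_dim5 = build_bitmap_index(2)


def bitmap_merge(bitmaps, dim, wanted):
    """OR together the bitmaps of every key the dimension subquery returns."""
    merged = 0
    for key, attr in dim.items():
        if attr in wanted:
            merged |= bitmaps.get(key, 0)
    return merged


def star_transformation(d2_attrs, d3_attrs, d5_attrs):
    # One merge per dimension, then an AND across the dimensions.
    hits = (
        bitmap_merge(bm_dim2, dim2, d2_attrs)
        & bitmap_merge(bm_dim3, dim3, d3_attrs)
        & bitmap_merge(bm_dim5, dim5, d5_attrs)
    )
    # Decode the surviving bits into row ids, then join back to the dimensions.
    rowids = [rowid for rowid in range(len(fact)) if (hits >> rowid) & 1]
    return [
        (dim2[fact[r][0]], dim3[fact[r][1]], dim5[fact[r][2]], fact[r][3])
        for r in rowids
    ]


result = star_transformation({"b"}, {"b"}, {"l", "m"})
print(result)
```

Because each foreign key has its own bitmap, no dimension has to lead the access path. A join can be left out of the query without disturbing the others.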
The following INSERT statement was extracted from the SGA and shows that the attribute index was used when
building the temporary segments for the larger dimension tables.
INSERT INTO "SYS"."ORA_TEMP_1_3"
Explain Plan
SELECT STATEMENT Optimizer=CHOOSE
INLIST ITERATOR
• In complex queries where conditions in the WHERE clause are based on non-foreign-key columns of the fact
table. These columns need bitmap indexes of their own to benefit from the bitmap transformation.
Summary
One of my standard interview questions is “How do you implement queries on a star schema?” The response is
usually, “I create the star schema and it automatically creates star queries from that.” My follow-up question is
“How do you implement queries on a star schema?” This paper reviewed the Star Query and Star Transformation
Query execution methods as ways to implement queries on a star schema. The two attack the query differently, and
either can be used effectively once you understand your data distributions and querying requirements. Now you’ll be
better prepared for that question in the interview!