Using SQL commands to create and manage semantic views¶
This topic explains how to use the following SQL commands to create and manage semantic views:
This topic also explains how to call the following stored procedure and function to create a semantic view from a YAML specification and get the specification for a semantic view:
Privileges required to create or replace a semantic view¶
To create or replace a semantic view, you must use a role with the following privileges:
CREATE SEMANTIC VIEW on the schema where you are creating the semantic view.
USAGE on the database and schema where you are creating the semantic view.
SELECT on the tables and views used in the semantic view.
For information about the privileges required to query a semantic view, see Privileges required to query a semantic view.
Creating a semantic view¶
To create a semantic view, you can either:
Run the CREATE SEMANTIC VIEW command.
Call the SYSTEM$CREATE_SEMANTIC_VIEW_FROM_YAML stored procedure, if you want to create a semantic view from a YAML specification.
The semantic view must be valid. See How Snowflake validates semantic views.
The next sections explain how to create a semantic view:
Using the CREATE SEMANTIC VIEW command¶
The following example uses the CREATE SEMANTIC VIEW command to create a semantic view.
The example uses the TPC-H sample data available in Snowflake. This data set contains tables that represent a simplified business scenario with customers, orders, and line items.
The example creates a semantic view named tpch_rev_analysis, using the tables in the TPC-H data set. The semantic view
defines:
Three logical tables (
orders,customers, andline_items).A relationship between the
ordersandcustomerstables.A relationship between the
line_itemsandorderstables.Facts that will be used to calculate metrics.
Dimensions for the customer name, the order date, and the year in which the order was placed.
Metrics for the average value of an order and the average number of line items in an order.
The next sections explain this example in more detail:
Note
For a full example, see Example of using SQL to create a semantic view.
Defining the logical tables¶
In the CREATE SEMANTIC VIEW command, use the TABLES clause to define the logical tables in the view. In this clause, you can:
Specify the physical table name and an optional alias.
Identify the following columns in the logical table:
Columns that serve as primary keys.
Columns that contain unique values (other than the primary key columns).
You can use these columns to define relationships in this semantic view.
Add synonyms for the table (for enhanced discoverability).
Include a descriptive comment.
Note
If there are multiple ways in which two tables can be joined, you should define a separate logical table for each of these ways. For information, see Defining different logical tables for different paths that join two tables.
In the example presented earlier, the TABLES clause defines three logical tables:
An
orderstable containing the order information from the TPC-Horderstable.A
customerstable containing the customer information from the TPC-Hcustomerstable.A
line_itemstable containing the line items in orders from the TPC-Hlineitemtable.
The example uses the PRIMARY KEY clause to identify the columns to be used as primary keys for each logical table. Primary keys and unique values help determine the types of relationships between the tables (for example, many-to-one or one-to-one).
The example also provides synonyms and comments that describe the logical tables and make the data easier to discover.
Defining different logical tables for different paths that join two tables¶
If there are multiple paths that you can use to join two physical tables, you should define separate logical tables and relationships for each path.
For example, in the TPC-H sample data available in Snowflake, there are two possible ways to
join the region and lineitem tables:
region->nation->supplier->partsupp->lineitemregion->nation->customer->orders->lineitem
The first path represents the region of the supplier, and the second path represents the region of the customer.
Although you can use a single logical table for region and a single logical table for nation, you should define separate
logical tables for the region of the supplier, the region of the customer, the nation of the supplier, and the nation of the
customer:
Then, define separate relationships that represent the different paths:
Identifying the relationships between logical tables¶
In the CREATE SEMANTIC VIEW command, use the RELATIONSHIPS clause to identify the relationships between the tables in the view. For each relationship, you specify:
An optional name for the relationship.
The name of the logical table containing the foreign key.
The columns in that table that define the foreign key.
The name of the logical table containing the primary key or columns with unique values.
The columns in that table that define the primary key or that contain unique values.
If you already specified PRIMARY KEY for the logical table in the TABLES clause, you don’t need to specify the primary key column in the relationship.
If there is a single UNIQUE keyword for the logical table in the TABLES clause, you don’t need to specify the corresponding columns in the relationship.
You can also specify a date, time, timestamp, or numeric column, if you want to join the columns on a range.
In the example presented earlier, the RELATIONSHIPS clause specifies two relationships:
A relationship between the
ordersandcustomerstables. In theorderstable,o_custkeyis the foreign key that refers to the primary key in thecustomerstable (c_custkey).A relationship between the
line_itemsandorderstables. In theline_itemstable,l_orderkeyis the foreign key that refers to the primary key in theorderstable (o_orderkey).
Using a date, time, timestamp, or numeric range to join logical tables¶
By default, when you specify a relationship between two logical tables, the tables are joined on an equality condition.
If you need to join two logical tables on a date, time, timestamp, or numeric range (where the values in a column of one table need to be in the same range as the values in a column of another table), you can specify the ASOF keyword with the column name in the REFERENCES clause:
A query of the semantic view defined above produces an ASOF JOIN that uses the
>= comparison operator in the MATCH_CONDITION clause. This joins the two tables so that the values in col_table_1 are
greater than or equal to the values in col_table_2:
Note
No other comparison operator in MATCH_CONDITION clause is supported.
You can use the ASOF keyword for columns of the same types that you can use with ASOF JOIN.
Note
You can specify at most one ASOF keyword in the definition of a given relationship. You can specify this keyword before any column in the list.
For example, suppose that you have tables containing customer, customer address, and order data:
In this example, the customer_address table has a ca_start_date column, which indicates when the customer started residing
at the specified address. The orders table has a o_ord_date column, which is the date of the order.
Suppose that you want to be able to query information about customer orders and retrieve the zip codes corresponding to where the customer resided when the orders were placed.
You can define a semantic view that specifies an ASOF join between the ca_start_date and o_ord_date columns:
Suppose that you query this semantic view to return the sum of the order amounts per month for each zip code:
The query effectively uses an ASOF JOIN to join the tables on the date columns, where the order date is greater than or equal to the address starting date:
Joining logical tables that contain ranges of values¶
You can use a range join when you want to join a table with another table that defines a range of possible values in the first table. For example, suppose that one table represents sales orders and has a column with the timestamp when the order was placed. Suppose that another table represents fiscal quarters and contains the distinct ranges of time that represent these quarters. You can create a semantic view that joins the two tables so that the row for an order includes the fiscal quarter in which the order was placed.
In the table that contains the ranges, each range must be distinct. No two ranges can overlap.
In the table data, if you want to specify the lowest possible value for the range or the highest possible value for the range, use NULL.
For example, the following table defines a set of ranges of times that do not overlap:
The first row covers the range that includes everything up to (but not including) January 1, 2024.
The last row covers the range that includes everything from March 20, 2024, onwards.
Note
No two rows can contain NULL in the start column, and no two rows can contain NULL in the end column.
For cases like these, you can set up a semantic view that supports range-join queries. When you create the semantic view, you must do the following:
For the logical table containing the start and end times of a time period, define a constraint that specifies that no two ranges can overlap.
In the TABLE clause of the CREATE SEMANTIC VIEW command, specify the CONSTRAINT clause in the logical table definition. For the syntax, see the documentation for CONSTRAINT in the CREATE SEMANTIC VIEW topic.
Define a relationship between the column containing the timestamp in one table and the start and end time columns in the other table.
In the RELATIONSHIPS clause of the CREATE SEMANTIC VIEW command, use the BETWEEN clause to specify the columns containing the start and end times. For the syntax, see the documentation for RELATIONSHIP in the CREATE SEMANTIC VIEW topic.
For example, suppose that the my_time_periods table defines distinct periods of time:
Suppose that the my_events table captures events that occurred within those periods of time:
You can define a semantic view that joins the tables. Rows in my_events are joined with rows in my_time_periods,
where the value in the event_timestamp column in my_events is within the range specified by the start_time and
end_time columns in my_time_periods.
The following query demonstrates how the rows are joined:
As shown in the examples, the dim_time_period_name dimension for each row in the results is the name of the time period that
the dim_event_timestamp dimension falls into.
Defining facts, dimensions, and metrics¶
In the CREATE SEMANTIC VIEW command, use the FACTS, DIMENSIONS, and METRICS clauses to define the facts, dimensions, and metrics in the semantic view.
You must define at least one dimension or metric in the semantic view.
For each fact, dimension, or metric, you specify:
The logical table it belongs to.
Note
If you want to define a derived metric (a metric that is not specific to one logical table), you must omit the logical table name. See Defining derived metrics.
A name for the fact, dimension, or metric.
The SQL expression to calculate it.
Note
For dimensions, you can specify a Cortex Search Service to use for the dimension. For information, see Defining a dimension that uses a Cortex Search Service.
Optional synonyms and comments.
Note
If a metric should not be aggregated across specific dimensions, you should specify that those dimensions should be non-additive.
For information, see Identifying the dimensions that should be non-additive for a metric.
The example presented earlier defines several facts, dimensions, and metrics:
Note
For additional guidelines on defining metrics that use window functions, see Defining and querying window function metrics.
Defining a dimension that uses a Cortex Search Service¶
To define a dimension that uses a Cortex Search Service, set the WITH CORTEX SEARCH SERVICE clause to the name of the Cortex Search Service. If the service is in a different database or schema, qualify the name of the service. For example:
Defining derived metrics¶
When you define a metric, you specify the name of the logical table that the metric belongs to. This is the logical table on which the metric is aggregated.
If you want to define a metric based on metrics from different logical tables, you can define a derived metric. A derived metric is a metric that is scoped to the semantic view (rather than to a specific logical table). A derived metric can combine metrics from multiple logical tables.
In the definition of a derived metric, omit the logical table name.
For example, suppose that you want to define a metric my_derived_metric_1 that is the sum of the metrics table_1.metric_1
and table_2.metric_2. When you define my_derived_metric_1, don’t qualify the name with any logical table name:
You can use other derived metrics in the expression. For example:
Note the following restrictions when you define a derived metric:
You cannot use the same name for a derived metric and a regular metric.
The expression for a derived metric can use:
Aggregations of dimensions and facts defined in any logical table in the semantic view.
Scalar expressions of metrics defined in any logical table in the semantic view.
Other derived metrics.
In the following example:
derived_metric_1uses a scalar expression with two metrics.derived_metric_2uses an aggregation of a dimension.derived_metric_3adds an aggregation of a dimension to another derived metric.
You don’t need to qualify the name of a metric, dimension, or fact in the expression if the name is not ambiguous. For example:
Note that
metric_1needs to be qualified bytable_1because there are two metrics namedmetric_1, butmy_unique_metric_namedoes not need to be qualified because the name is unique.In the expression for a derived metric, you cannot use the following:
Aggregations of metrics.
Window functions.
References to physical columns.
References to facts or dimensions that are not aggregated.
You cannot use a derived metric in the expression for a regular metric, dimension, or fact. Only another derived metric can use a derived metric in its expression.
Identifying the dimensions that should be non-additive for a metric¶
In some cases, a metric should not be aggregated across specific dimensions. In these cases, you can mark the dimensions as non-additive.
Understanding the problem with aggregating metrics across some dimensions¶
Suppose you have a table that contains the account balances of each customer’s checking and savings accounts on a specific day.
Suppose that you want to define a semantic view that includes:
The following dimensions:
Customer ID
Account type
Year
Month
Day
A metric for the sum of the balance.
The following statement creates a semantic view that includes the dimensions and metrics listed above:
If you want to retrieve the total balance of the checking and savings accounts for each customer at the end of each year, you can
query the semantic view for the m_account_balance metric and specify the customer_id_dim and year_dim dimensions.
However, the m_account_balance metric will be the sum of the balances of each day for each customer because the metric is
aggregated by the date dimensions.
In the example above, for cust-001 in 2024, 910 is the sum of the balances for each day
(100 + 110 + 140 + 150 + 200 + 210).
Preventing a metric from being aggregated across specific dimensions¶
To prevent the metric from being aggregated by the date dimensions, specify the date dimensions in the NON ADDITIVE BY clause when creating the semantic view:
Note
If you specify the NON ADDITIVE BY clause in a metric, you cannot refer to that metric in the definitions of metrics that are not derived. Only derived metrics can refer to metrics that specify non-additive dimensions.
Specifying the NON ADDITIVE BY clause makes the metric a semi-additive metric.
When you query this semantic view, the m_account_balance metric is no longer aggregated by the date dimensions. The query
aggregates the account balances at the end of the period in each group of queried dimensions.
In the example above, for cust-001 in 2024, 210 is the sum of the checking and savings account balances for the last day
of the year that contains data:
The last day of 2024 that contains data is
2024-03-30.There is no row with that date for the checking account, so the resulting metric is the balance of the savings account (
210).
As another example, if you just want the total account balance for all customers at the end of the year, you can specify the
year_dim dimension.
Because the date dimensions are marked as non-additive, the query sums the values at the end of the period (by date) for the checking and savings account balances for each customer.
During query processing, the rows are sorted by the non-additive dimensions, and the values from the last rows (the latest snapshots of values) are aggregated to compute the metric.
Note
Because the rows are sorted by the non-additive dimensions, the order in which you specify the dimensions is important. This is similar to the order in which you specify columns in the ORDER BY clause.
Specifying the sort order for non-additive dimensions¶
As demonstrated in the example, the metric aggregates the values of the checking and savings balances for each customer at the end of a period. If you want to change the sort order, you can specify the ASC or DESC keyword next to the dimension name. For example:
In this example, the metric evaluates to the earliest date specified by year_dim, month_dim, and day_dim.
If the dimension includes NULL values, you can use the NULLS FIRST or NULLS LAST keywords to specify whether NULL values are sorted first or last in the results:
Marking a fact or metric as private¶
If you are defining a fact or metric only for use in calculations in the semantic view and you don’t want the fact or metric to be returned in a query, you can specify the PRIVATE keyword to mark the fact or metric as private. For example:
Note
You cannot mark a dimension as private. Dimensions are always public.
When you query a semantic view that has private facts or metrics, you cannot specify a private fact or metric in the following clauses:
The SELECT list
FACTS in the SEMANTIC_VIEW clause
METRICS in the SEMANTIC_VIEW clause
METRICS
WHERE in the SELECT statement or the SEMANTIC_VIEW clause
Some commands and functions include private facts and metrics:
Private facts and metrics do appear in the output of the DESCRIBE SEMANTIC VIEW command. The rows for private facts and metrics have
PRIVATEin theaccess_modifiercolumn.Private facts and metrics are listed in the return value of a GET_DDL function call, as noted in Getting the SQL statement for a semantic view.
Some commands and functions include private facts and metrics only under specific conditions:
Private facts and metrics are listed in the INFORMATION_SCHEMA SEMANTIC_FACTS and SEMANTIC_METRICS views only if you are using a role that has been granted the REFERENCES or OWNERSHIP privilege on the semantic view.
Otherwise, these views list only the public facts and metrics.
Other commands and functions do not include private facts and metrics:
Private facts do not appear in the output of the SHOW SEMANTIC FACTS command.
Private metrics do not appear in the output of the SHOW SEMANTIC METRICS command.
Providing custom instructions for Cortex Analyst¶
In a semantic view, you can provide instructions for Cortex Analyst that explain how to:
Generate the SQL statement
Classify questions and prompt for additional information
To provide these custom instructions, use the following clauses:
For instructions on how to generate the SQL statement, use the AI_SQL_GENERATION clause in the CREATE SEMANTIC VIEW command.
For example, to tell Cortex Analyst to generate the SQL statement so that all numeric columns are rounded to two decimal points, specify the following:
For instructions on how to classify questions, use the AI_QUESTION_CATEGORIZATION clause.
For example, to tell Cortex Analyst to reject questions about users, specify the following:
You can also provide instructions to ask for more details, if the question isn’t clear. For example:
Creating a semantic view from a YAML specification¶
To create a semantic view from a YAML specification, you can call the SYSTEM$CREATE_SEMANTIC_VIEW_FROM_YAML stored procedure.
First, pass TRUE as the third argument to verify that you can create the semantic view from the YAML specification.
The following example verifies that you can use a given semantic model specification in YAML to create a semantic view named
tpch_analysis in the database my_db and schema my_schema:
If the specification is valid, the stored procedure returns the following message:
If the YAML syntax is invalid, the stored procedure throw an exception. For example, if a colon is missing:
the stored procedure throws an exception, indicating that the YAML syntax is invalid:
If the specification refers to a physical table that does not exist, the stored procedure throws an exception:
Similarly, if the specification refers to a primary key column that does not exist, the stored procedure throws an exception:
You can then call the stored procedure without passing in the third argument to create the semantic view.
The following example creates a semantic view named tpch_analysis in the database my_db and schema my_schema:
Modifying the comment for an existing semantic view¶
To modify the comment for an existing semantic view, run the ALTER SEMANTIC VIEW command. For example:
Note
You can’t use the ALTER SEMANTIC VIEW command to change properties other than the comment. To change other properties of the semantic view, replace the semantic view. See Replacing an existing semantic view.
You can also use the COMMENT command to set a comment for a semantic view:
Replacing an existing semantic view¶
To replace an existing semantic view (for example, to change the definition of the view), specify OR REPLACE when executing CREATE SEMANTIC VIEW. If you want to preserve any privileges granted on the existing semantic view, specify COPY GRANTS. For example:
Listing semantic views¶
To list semantic views in the current schema or a specified schema, run the SHOW SEMANTIC VIEWS command. For example:
The output of the SHOW OBJECTS command includes semantic views. In the kind column, the type of
object is listed as VIEW. For example:
You can also query the views for semantic views in the ACCOUNT_USAGE and INFORMATION_SCHEMA schemas.
Listing dimensions, facts, and metrics¶
To list the dimensions, facts, and metrics that are available in a view, schema, database, or account, you can run the following commands:
By default, the commands list the dimensions, facts, and metrics that are available in semantic views defined in the current schema:
The following examples demonstrate how to list the dimensions, facts, and metrics for semantic views within different scopes:
List the dimensions, facts, and metrics in semantic views in the current database:
List the dimensions, facts, and metrics in semantic views in a specific schema or database:
List the dimensions, facts, and metrics in semantic views in the account:
List the dimensions, facts, and metrics in a specific semantic view:
If you are querying a semantic view, you can use the SHOW SEMANTIC DIMENSIONS FOR METRIC command to determine which dimensions you can return when specifying a given metric. For details, see Choosing the dimensions that you can return for a given metric.
When you run the SHOW COLUMNS command for a semantic view, the output includes the dimensions, facts,
and metrics in the semantic view. The kind column indicates if the row represents a dimension, fact, or metric.
For example:
Viewing the details about a semantic view¶
To view the details of a semantic view, run the DESCRIBE SEMANTIC VIEW command. For example:
Getting the SQL statement for a semantic view¶
You can call the GET_DDL function to retrieve the DDL statement that created a semantic view.
Note
To call this function for a semantic view, you must use a role that has been granted the REFERENCES or OWNERSHIP privilege on the semantic view.
When calling GET_DDL, pass in 'SEMANTIC_VIEW' as the object type. For example:
The return value includes private facts and metrics (facts and metrics that are marked with the PRIVATE keyword).
Getting the YAML specification for a semantic view¶
To get the YAML specification for a semantic view, call the SYSTEM$READ_YAML_FROM_SEMANTIC_VIEW function.
The following example returns the YAML specification for the semantic view named tpch_analysis in the database my_db and
schema my_schema:
Exporting a semantic view to a Tableau Data Source (TDS) file¶
To export a semantic view to a Tableau Data Source (TDS) file, call the SYSTEM$EXPORT_TDS_FROM_SEMANTIC_VIEW function.
The following example returns the TDS file content for the semantic view my_sv_for_export:
Copy the XML to a .tds file and open the file in Tableau Desktop.
Tableau Desktop displays a folder for each logical table in the list of folders on the left. The names of the folders use spaces
instead of underscores, and each word starts with an uppercase letter. For example, the folder name for the date_dim logical
table is Date Dim.
Each folder contains Tableau dimensions and measures that correspond to the dimensions, facts, and metrics in the semantic view.
The next sections provide more detail and the limitations of the conversion process:
About the conversion¶
The function converts dimensions, facts, and metrics in the semantic view to the following equivalents in the Tableau TDS file:
Element in the semantic view |
Tableau equivalent (dimension or measure) |
How the data is aggregated |
|---|---|---|
Dimension |
|
|
Numeric fact |
Measure |
SUM |
Non-numeric fact |
Dimension |
|
Numeric metric |
Measure |
The TDS file uses a calculated field in place of the metric. The calculated field passes the value of the metric to the Snowflake AGG function. |
Non-numeric metric |
Dimension |
|
Numeric derived metric |
Measure |
The TDS file uses a calculated field in place of the metric. The calculated field passes the value of the metric to the Snowflake AGG function. |
Non-numeric derived metric |
Dimension |
|
The following Snowflake data types are mapped to corresponding Tableau TDS data types:
Snowflake data type |
Equivalent Tableau data type |
|---|---|
NUMBER/FIXED (if the scale is greater than 0) |
real |
NUMBER/FIXED (if the scale is 0 or null) |
integer |
FLOAT or DECFLOAT |
real |
STRING or BINARY |
string |
BOOLEAN |
boolean |
TIME |
time |
DATE |
date |
DATETIME or TIMESTAMP |
datetime |
GEOGRAPHY |
spatial |
Semi-structured (VARIANT, OBJECT, ARRAY), structured (ARRAY, OBJECT, MAP), unstructured (FILE), GEOMETRY, UUID, VECTOR |
string |
The TDS file has the following capabilities customized for the connection to Snowflake:
Customization name |
Value |
Effect of the customization |
|---|---|---|
|
|
Prevents Tableau from actually running a query like |
|
|
Prevents Tableau from “preparing” a statement (sending it to Snowflake to be parsed without executing) to learn about types. |
|
|
Prevents Tableau from using a |
|
|
Forces Tableau to enable and use the standard ODBC |
|
|
Prevents Tableau from escaping underscores when searching for the database name. |
Limitations when using a semantic view in Tableau Desktop¶
The following limitations apply to semantic views in Tableau Desktop:
You cannot create an extract from a semantic view.
If you change your connection from Live to Extract, Tableau Desktop fails with the following error:
You cannot use the Measure Values field in a semantic view.
If you select the Measure Values field in a semantic view, Tableau Desktop reports the following error:
You cannot select the Count field in a semantic view.
If you select SemanticViewName(Count), Tableau Desktop reports the following error:
Tableau Desktop cannot report the number of rows in the semantic view because the number of rows can vary, depending on the dimensions, facts, and metrics that are specified in the query.
You cannot drag a measure by itself.
If you drag a measure, Tableau Desktop reports the following error:
You cannot directly use a non-numeric metric.
SYSTEM$EXPORT_TDS_FROM_SEMANTIC_VIEW converts non-numeric metrics to dimensions in Tableau. If you attempt to use one of these dimensions, Tableau Desktop reports the following error:
To work around this, convert the dimension to a measure:
Right-click on the dimension, and select Convert to Measure.
This converts the dimension to a measure, using the default aggregation Count (Distinct).
To use a different aggregation, right-click on the converted measure, select Default Properties » Aggregations, and select the aggregation that you want to use.
Renaming a semantic view¶
To rename a semantic view, run ALTER SEMANTIC VIEW … RENAME TO …. For example:
Removing a semantic view¶
To remove a semantic view, run the DROP SEMANTIC VIEW command. For example:
Granting privileges on semantic views¶
Semantic view privileges lists the privileges that you can grant on a semantic view.
The following privileges on a semantic view are required to work with the view:
Any privilege (for example, MONITOR, REFERENCES, or SELECT) on a view is required to run the DESCRIBE SEMANTIC VIEW command on that view.
Any privilege on a view is required to display that view in the output of the SHOW SEMANTIC VIEWS command.
SELECT is required to query the semantic view.
Note
To query a semantic view, you don’t need the SELECT privilege on the tables used in the semantic view. You only need the SELECT privilege on the semantic view itself.
This behavior is consistent with the privileges required to query standard views.
To use a semantic view that you do not own in Cortex Analyst, you must use a role that has the REFERENCES and SELECT privileges on that view.
To grant the REFERENCES and SELECT privileges on a semantic view, use the GRANT <privileges> … TO ROLE
command. For example, to grant the REFERENCES and SELECT privileges on the semantic view named my_semantic_view to the role
my_analyst_role, you can run the following statement:
If you have a schema containing semantic views that you want to share with Cortex Analyst users, you can use future grants to grant the privileges on any semantic view that you create in that schema. For example: