Scaling Data Science and Enterprise AI With A Semantic Layer

Scaling Data Science and Enterprise AI with a Semantic Layer
Create a Bridge between BI and AI
Table of Contents

The Gap Between AI and BI
Challenges to Scaling Enterprise AI
Bridging the Gap with a Semantic Layer
Building Managed Features within the Semantic Layer
Curating Managed Features within a Semantic Layer Platform
Hardening ML Data Pipelines
Semantic Predictions
Deep Dive: A Focus on Time Series
AtScale for Bridging AI and BI


The Gap Between AI and BI


This eBook is the fourth in a series focused on the application of semantic layer
technology to modern cloud analytics.

The first wave of “big data” technologies focused on managing the petabytes and exabytes of data generated by humans and machines spanning enterprise applications, online user behavior, smart devices, point of sale, lab data, and more. Cloud data platforms made it more practical to consolidate this data within centralized repositories that could be more easily analyzed. Business intelligence (BI) technologies evolved to support analysts and decision makers in making sense of this data. BI focuses on supplying descriptive and diagnostic analytics to support business users in creating insights that support better decision making.

“The collision of once disparate data and analytics markets, fueled by the proliferation of cloud and augmented capabilities, provides the opportunity for data and analytics to drive business driven decisions and outcomes” - Gartner

Data science and enterprise AI/ML have emerged over the past several years to include a set of technologies and techniques that leverage artificial intelligence and machine learning models to generate predictive and prescriptive insights from data. Predictive insights, like sales forecasts, help businesses more effectively plan for the future. Prescriptive insights are suggested actions to optimize a business process.

Despite massive investment in data science and enterprise AI, organizations have struggled to create return due to the complexity of scaling AI models to production and driving systematic business adoption of AI-generated insights.

There is a clear opportunity to address these challenges by leveraging well established BI infrastructure, especially when augmented with a modern semantic layer strategy.
Challenges to Scaling Enterprise AI


Despite the hype and investment around data science and AI/ML, we are still in the early stage of realizing the impact of enterprise AI. Most companies are just scratching the surface with harnessing the potential of AI and generating real business impact from their investments. Part of the reason is the natural silos that exist between data science teams and the data teams built around BI processes.

For one, data scientists and BI professionals use different tools and technologies. Data science teams spend their day working with Python, open source libraries, containers, APIs, and AutoML platforms. In contrast, business teams typically work with BI platforms like Tableau and Power BI to explore the same data sets.

Data scientists are typically PhDs or computer engineers on separate teams from their BI counterparts. BI teams may sit in IT or within a line-of-business department. Since they operate independently, data scientists are not always aware of or able to leverage investments in data infrastructure built to support mature BI programs.

Data scientists use different terminology than their BI counterparts. BI users talk in the language of measures and dimensions, and focus on descriptive or diagnostic analytics. Data scientists talk about features and predictive model results. Both teams have the same goal: using raw data to model their business in order to generate insights that can help decision-makers be more effective. But the path they take is quite different.

Scaling enterprise AI programs is a key challenge for organizations. Compounding the ever-present shortage of skills, data scientists often waste valuable time wrangling data. Once data teams do generate valuable insights, there is often no consistent mechanism for publishing insights (e.g. predictions, suggestions) back to BI teams and the rest of the organization.

Figure: The Silos Separating AI from BI Teams


Bridging the Gap with a Semantic Layer


As discussed in the first eBook of this series, a semantic layer abstracts away the complexity of raw data and creates a platform for teams to create business-focused views of key metrics and analysis dimensions that can be accessed by a broad range of data consumers.

A semantic layer simplifies access to raw data, exposing a business oriented representation of different enterprise data sources including data shared by partners and data acquired from third-party sources.

A semantic layer is where business-oriented logical data models are created and maintained. For BI, this is where metrics and dimensions are defined, tying the views exposed to business users to raw data sources. The modeling environment also enables composability and reusability of models and definitions. The same utilities can be used by data teams to build specific cuts of data that define ML features. Defining complex calculations or setting up time-relative features of various lags or windows can be done far more easily in a modeling tool than by writing complex SQL.

In this way, AtScale can be used to create a set of managed features that can be used for exploratory data analysis, model training, or production models. As these managed features are queried in the form of BI queries, Python scripts, or API calls, a semantic layer platform like AtScale serves query traffic to the underlying data platform in the form of optimized SQL. As will be discussed in a later section, this approach to feature serving simplifies data pipeline creation and maintenance.

As predictions are generated by AutoML or AI platforms, they can be written back to data platforms through the semantic layer to inherit the semantic context defined within the model. This approach creates a set of semantic predictions that can be explored from existing BI platforms alongside actuals using the same analysis dimensions. This simplifies business consumption of AI-generated insights and makes it possible to reach much wider audiences.

A semantic layer like AtScale has the potential to reduce the complexity and cost of bringing new models to production as well as generate greater return by encouraging business adoption of AI.
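To make the idea of serving logical requests as SQL more concrete, here is a minimal sketch of a toy semantic model. It is not AtScale's implementation or API: the `Metric` and `Dimension` classes, the `compile_query` function, and the table and column names are all hypothetical, and real platforms add optimization, security, and far richer modeling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A business metric: a name plus the SQL aggregate that computes it."""
    name: str
    expression: str  # e.g. "SUM(amount)"


@dataclass(frozen=True)
class Dimension:
    """An analysis dimension mapped to a physical column."""
    name: str
    column: str


def compile_query(table, metrics, dimensions):
    """Translate a logical request (metrics by dimensions) into one SQL string."""
    select_dims = [f"{d.column} AS {d.name}" for d in dimensions]
    select_metrics = [f"{m.expression} AS {m.name}" for m in metrics]
    sql = f"SELECT {', '.join(select_dims + select_metrics)} FROM {table}"
    if dimensions:
        sql += " GROUP BY " + ", ".join(d.column for d in dimensions)
    return sql


# Hypothetical model: table and column names are illustrative only.
revenue = Metric("sales_revenue", "SUM(amount)")
state = Dimension("state", "store_state")

sql = compile_query("fact_sales", [revenue], [state])
print(sql)
```

The point of the sketch is that a BI tool and a Python script can both ask for "sales_revenue by state" and receive the same definition, because the SQL is generated from one shared model.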
Building Managed Features within the Semantic Layer


Upwards of 80% of a data scientist’s time is spent preparing data for exploration and model training. Feature engineering is a substantial portion of this data wrangling time. The semantic model can serve as a starting point for building a set of business-vetted features.

The semantic layer is a repository for metrics that have already been defined by the business. This source of “business actuals” for any number of metrics, including sales revenue, costs, sales margins, unit sales, shipments, and inventory levels, can be directly leveraged by data science models with no wrangling or risk of miscalculation.

Furthermore, the semantic layer can be used to quickly build new calculated metrics that can be centrally managed and reused for different ML models. This is particularly useful for time-relative measures, including different windows and lags for autoregressive and moving average based time series. Managing time-relative metrics within the semantic layer radically simplifies data pipelines, makes sharing and reuse possible, and reduces the chance of miscalculations.

It can also be highly advantageous to use governed dimensional hierarchies when engineering features, rather than reproducing them for every model. Standardized dimensions for time, product, geography, and other master data can be shared across different use cases. This can save time and ensures that predictions generated by ML models will be usable and understandable by business users.
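As an illustration of the time-relative measures mentioned above, the following sketch builds one lag feature and one moving-average feature with pandas. The data and column names are made up for the example; in a semantic layer these calculations would be defined once in the model rather than re-coded in each notebook.

```python
import pandas as pd

# Illustrative daily series; values and column names are invented for this sketch.
sales = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=10, freq="D"),
        "unit_sales": [10, 12, 11, 15, 14, 13, 16, 18, 17, 20],
    }
).set_index("date")

features = sales.copy()
# Lag feature: the value 7 rows earlier. This equals "same day of the previous
# week" only because the index is daily with no gaps.
features["unit_sales_lag_7"] = features["unit_sales"].shift(7)
# Window feature: trailing 3-day moving average, including the current day.
features["unit_sales_ma_3"] = features["unit_sales"].rolling(window=3).mean()

print(features.tail(3))
```

The first rows of each derived column are empty because the lag or window has not yet accumulated enough history, which is exactly the kind of detail that is easy to handle inconsistently when every model re-implements it.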
Curating Managed Features within the Semantic Layer


Data scientists will draw on a wide range of data sources to build features for exploration and training of new models. For data managed within a data warehouse or lakehouse, the AtScale semantic layer can be used to centrally define managed features and ensure consistency across different use cases.

Furthermore, new features created by AutoML or AI platforms can also become managed features. For instance, an AI platform may suggest an expanded set of time-relative features of different windows or lags that improve model performance. AI-Link can write these features back to the model as managed features. These managed features inherit full semantic context, making them more discoverable and easier to work with, consistently, at any stage in ML model development.

The AtScale semantic layer can be used in conjunction with a feature store like Feast to aggregate the superset of features regardless of where they are sourced. Features can be served directly from AtScale, or through a feature store to train models in AutoML or other AI platforms.

Curating a set of managed features within AtScale, whether or not they are integrated with a feature store, has a few advantages. First, they can be centrally governed for definitional consistency, which ensures that business context is retained regardless of where they get used. Second, AtScale provides detailed lineage information for even the most complicated calculated metrics. Finally, managed features can be protected from underlying changes to data: in many cases data consumers will be completely unaware if data stewards redefine the source data used to build a metric.
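To show what lineage for a calculated metric can mean in practice, here is a small sketch that resolves a metric down to the raw columns it depends on. The `DEFINITIONS` dictionary and the `lineage` function are hypothetical and are not how AtScale stores or reports lineage; the sketch also assumes the definitions contain no cycles.

```python
# Hypothetical metric definitions: each calculated metric lists its direct inputs.
# Names that do not appear as keys are treated as raw source columns.
DEFINITIONS = {
    "sales_margin": ["sales_revenue", "total_cost"],
    "sales_revenue": ["fact_sales.amount"],
    "total_cost": ["fact_sales.unit_cost", "fact_sales.quantity"],
    "margin_lag_7": ["sales_margin"],
}


def lineage(metric, definitions=DEFINITIONS):
    """Return the sorted raw columns a metric ultimately depends on."""
    inputs = definitions.get(metric)
    if inputs is None:
        return [metric]  # a raw column is its own lineage
    raw = set()
    for name in inputs:
        raw.update(lineage(name, definitions))
    return sorted(raw)


print(lineage("margin_lag_7"))
```

With definitions held centrally, a question such as "which features are affected if `fact_sales.unit_cost` changes?" can be answered by walking the same structure in the other direction.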
Integrating AtScale Managed Features with a Feature Store


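Sample Python Code

import datetime

# `project` and `prediction_features` are not defined in this excerpt; both are
# assumed to have been created earlier in the same script or notebook.
project.create_feast_repo(
    project_name="AtScale_AI",
    features=prediction_features,
    entities=["item", "state"],
    timestamp="date",
    view_name="dataset_ts",
    ttl=datetime.timedelta(weeks=0),
    force_rewrite=True,
)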
Harden ML Pipelines by Serving Features from AtScale


It’s important that production AI/ML models have consistent and reliable access to raw data. Changes in the raw data can disrupt model pipelines. This makes it important to insulate production models from changes in the underlying data.

There’s a need to harden data pipelines for production models so they can deliver feature data accurately, at scale, while meeting performance and recency requirements.

Managed features can serve as this necessary buffer from changes to raw data sources. AtScale exposes the semantic layer to AI workflows with bi-directional Python connectivity, so models can draw directly from managed features that are connected to live data in an underlying cloud data warehouse or lakehouse.

As managed features are requested via AI-Link by a Python script, AtScale dynamically pushes optimized queries to the underlying data platform while ensuring high performance. No data is persisted within the AtScale layer: all features are delivered on-demand through virtualized query pipelines, with all SQL-based transformations taking place on the data platform.

Governance also becomes important at the stage of feature serving. Real-time enforcement of security and access control policies can be managed within the semantic layer. Likewise, governance of query performance and data platform resource consumption can be managed by a semantic layer platform like AtScale.
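The buffering role described above can be sketched with a plain SQL view standing in for a managed feature. This is an analogy under stated assumptions, not AtScale's mechanism: it uses Python's built-in `sqlite3` module, and the table, view, and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original raw table and a view that plays the role of the managed feature.
conn.execute("CREATE TABLE sales_v1 (item TEXT, amt REAL)")
conn.executemany("INSERT INTO sales_v1 VALUES (?, ?)", [("a", 10.0), ("b", 5.0)])
conn.execute(
    "CREATE VIEW managed_sales AS SELECT item, amt AS sales_revenue FROM sales_v1"
)

# The model pipeline only ever issues this query.
FEATURE_QUERY = "SELECT item, sales_revenue FROM managed_sales ORDER BY item"
before = conn.execute(FEATURE_QUERY).fetchall()

# A data steward restructures the raw data: new table, renamed columns, and
# amounts now stored in cents.
conn.execute("CREATE TABLE sales_v2 (sku TEXT, amount_cents INTEGER)")
conn.execute(
    "INSERT INTO sales_v2 SELECT item, CAST(amt * 100 AS INTEGER) FROM sales_v1"
)
conn.execute("DROP VIEW managed_sales")
conn.execute("DROP TABLE sales_v1")
conn.execute(
    "CREATE VIEW managed_sales AS "
    "SELECT sku AS item, amount_cents / 100.0 AS sales_revenue FROM sales_v2"
)

# The pipeline's query is unchanged and returns the same feature values.
after = conn.execute(FEATURE_QUERY).fetchall()
print(before == after)
```

Only the definition behind `managed_sales` changed; the consuming query did not, which is the property that keeps production models running when source data is reorganized.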
Make AI-Generated Insights More Accessible with Semantic Predictions


One of the most significant challenges for enterprise AI is driving business adoption of AI-generated insights. Oftentimes, the predictions generated by production models get stranded in isolated data science tools or .csv file dumps to a data lake. Deriving value from data science means getting AI-generated insights in front of decision makers quickly and within a familiar analytics experience.

AtScale supports ML model results pipelines that manage the writeback of predictions to the underlying data platform while automatically inheriting the full context of the semantic data model. This makes prediction datasets immediately accessible by business users using existing business intelligence infrastructure.

AtScale uses the concept of Semantic Predictions to represent the idea of prediction data sets with full semantic context. Semantic Predictions can be explored leveraging the dimensional hierarchies to drill up and down on large prediction datasets. Furthermore, predictions can be analyzed alongside business actuals using existing analysis dimensions in the same BI dashboards.

The only way to drive business adoption of enterprise AI is to integrate model-generated insights into existing enterprise analytics workflows. This approach builds on infrastructure and data literacy initiatives already in place to support traditional BI.

Figure: Consume Semantic Predictions Alongside Actuals
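A minimal sketch of analyzing predictions alongside actuals follows, assuming both data sets carry the same dimensions (here `state`, `item`, and `date`). The data and column names are invented, and pandas stands in for the BI tool; the sketch only illustrates why shared dimensions make the comparison straightforward.

```python
import pandas as pd

# Actuals and model predictions share the same dimensions: state, item, and date.
actuals = pd.DataFrame(
    {
        "state": ["CA", "CA", "NY", "NY"],
        "item": ["widget", "gadget", "widget", "gadget"],
        "date": pd.to_datetime(["2024-01-01"] * 4),
        "unit_sales": [100, 40, 80, 30],
    }
)
predictions = pd.DataFrame(
    {
        "state": ["CA", "CA", "NY", "NY"],
        "item": ["widget", "gadget", "widget", "gadget"],
        "date": pd.to_datetime(["2024-01-01"] * 4),
        "predicted_unit_sales": [95.0, 50.0, 85.0, 28.0],
    }
)

dims = ["state", "item", "date"]
combined = actuals.merge(predictions, on=dims, how="outer")

# "Drill up" from item level to state level using the shared dimension.
by_state = combined.groupby("state")[["unit_sales", "predicted_unit_sales"]].sum()
by_state["error"] = by_state["predicted_unit_sales"] - by_state["unit_sales"]
print(by_state)
```

Because both tables use the same dimension values, the join and the roll-up need no mapping logic, which mirrors what a business user experiences when predictions appear next to actuals in an existing dashboard.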


Deep Dive: A Focus on Business Forecasting with Predictive Models


We’ve explored how AtScale democratizes predictive and prescriptive analytics. Let’s take a deeper look at how AtScale supports data scientists as they support business forecasting with time series and predictive models, a core activity of enterprise AI.

Time series analysis is based on complex statistical techniques and ML algorithms. Fortunately, the hard work of model selection and model training can be done within a variety of tools such as AutoML platforms. In practice, the real challenge is managing the data used for time series analysis.

Time-relative data is complicated to work with. It involves creating and managing multiple variations of calculations performed on the same metric (e.g. sales). Most commonly, time series data incorporates lag measurements (e.g. comparisons with the same day of the previous week) and window measurements (e.g. three-day running averages). Data sets use different definitions and granularities in the time dimension, and don’t necessarily come with the aggregation logic (such as hourly or daily) that data scientists need. Models can incorporate data from different systems (such as ERPs and CRMs) and from outside the organization (e.g. an external temperature feed).

Preparing and maintaining clean data across these sources is complicated. Furthermore, maintaining data consistency as models move into production is critical to producing reliable results.

Combining AtScale with a modern cloud data platform like Snowflake or Databricks creates a natural foundation for simplifying data management for time series analysis. AtScale helps data teams create “conformed” time dimensions that map different data sets to a common hierarchical expression of time. This allows data scientists and supporting data teams to quickly and consistently create time-based features across different projects and at different stages of model development. Time values and time-relative metrics become Managed Features that are “pre-vetted” by the business and available for direct use with a high degree of accuracy.

Cloud data platforms provide the tools for managing large tabular data sets. This data is typically structured for general analytical usage. AtScale lets data scientists and data engineers interact with business-ready forms of raw data with familiar tools like Python notebooks. AtScale manages the translation of the managed feature (e.g. week over week change in sales) to the raw SQL to pull the value from the raw data. This form of feature serving is highly efficient, highly accurate, and resilient to changes to underlying data structure.
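The idea of a “conformed” time dimension can be sketched in pandas: two data sets at different grains (an hourly external feed and daily sales) are mapped to a shared daily key, and higher levels of a common time hierarchy are then derived from it. The data and column names are invented for the example, and this is an illustration of the concept rather than how AtScale models time.

```python
import pandas as pd

# Hourly external feed (e.g. temperature) and daily sales at different grains.
temps = pd.DataFrame(
    {
        "ts": pd.date_range("2024-01-01", periods=48, freq="h"),
        "temp_c": [5.0] * 24 + [7.0] * 24,
    }
)
sales = pd.DataFrame(
    {
        "day": pd.date_range("2024-01-01", periods=2, freq="D"),
        "sales": [200.0, 260.0],
    }
)

# Conform both data sets to a shared daily key.
temps["day"] = temps["ts"].dt.normalize()
daily_temp = temps.groupby("day", as_index=False)["temp_c"].mean()
conformed = sales.merge(daily_temp, on="day", how="left")

# Add higher levels of the common time hierarchy (weeks here start on Monday).
conformed["week_start"] = conformed["day"].dt.to_period("W").dt.start_time
conformed["month"] = conformed["day"].dt.to_period("M").dt.start_time
print(conformed)
```

Once every data set resolves to the same day, week, and month keys, time-based features such as lags and running averages can be defined against the hierarchy instead of against each source's own timestamp format.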
AtScale for Bridging AI and BI

To stay competitive, enterprises need to focus on building scalable data science and enterprise AI programs that deliver real business impact. AI-enabled analytics have many of the same requirements as traditional BI. Despite the differences discussed in this eBook, the objectives of both AI and BI are the same, and modern enterprise analytics programs need to address both.

The real goal of both AI and BI is to deliver descriptive data and AI-augmented insights to business users to support broader insight creation and decision making. This means making data and AI available to users by whatever means and in whatever form is most useful. Storing massive amounts of data and generating the most accurate forecasts possible are both completely useless if not adopted and leveraged by business users. Many organizations are struggling to show return on data, data science, and AI technology investments because they are unable to move AI models to production.

A semantic layer platform like AtScale provides a unique set of advantages that helps move more models into production while directly supporting business interaction with data and AI-generated insights. Leveraging existing BI infrastructure with a modern semantic layer strategy provides a path toward getting more value from cloud data platforms and from AI investments. AtScale can serve as a control plane for coordinating insight creation across an organization, ensuring data assets are usable by both human users and AI resources, as well as ensuring AI-generated insights are accessible by business decision makers where and when they are useful.
Take the Next Step


REQUEST A DEMO
CONTACT US
LEARN MORE

Read The Practical Guide to Using a Semantic Layer
Download The Complete Buyers Guide to a Semantic Layer
Review How does using a semantic layer impact cloud data warehouse performance? (benchmark reports)
Get advice from fellow data & analytics leaders on how to scale smarter data-driven decision-making
Stay current on analytics strategies with short articles and webinars on topics in analytics strategies

About AtScale
AtScale enables smarter decision-making by accelerating the flow of data-driven insights. The company’s semantic layer platform simplifies, accelerates, and extends business intelligence and data science capabilities for enterprise customers across all industries. With AtScale, customers are empowered to democratize data, implement self-service BI and build a more agile analytics infrastructure for better, more impactful decision making. For more information, please visit www.atscale.com and follow us on LinkedIn, Twitter or Facebook.
