
Architecting to

Support Machine
Learning

Humberto Cervantes, UAM


Iurii Milovanov, SoftServe
Rick Kazman, University of Hawaii
PARTICULARITIES OF ML SYSTEMS

● In ML systems, the behaviour is not specified directly in code but is learned from data

[Figure: in traditional programming, data and a program are given to the computer, which produces output; in machine learning, data and the expected output are given to the computer, which produces a model.]

● At the core of the system, there is a model that uses data transformed into features to
perform predictions for particular tasks
● This model can be seen as a compiled software library that is part of a bigger system
TWO MAIN WORKFLOWS
[Figure: the two main workflows.
Development environment: raw historical data → transformation into features → model selection and training → trained ML model.
Serving environment: new raw data → transformation into features → trained ML model → results derived from prediction.
The data transformation rules and the trained model are transferred from the development environment to the serving environment; data and results flow back from serving to development to refine the model and to support automatic retraining.]
MODEL DEVELOPMENT LIFECYCLE

Shorter cycles of model development/refinement are needed

[Figure: model development → model serving → model refinement → serving of the refined model → further refinement → serving, repeated in short cycles.]
ARCHITECTING THE SYSTEM

● Supporting initial model development and the model refinement lifecycle introduces many architectural concerns:
“Architectural concerns encompass additional aspects that need to be considered as part of architectural design but which are not expressed as traditional requirements.”
● Identifying them is useful when designing a system that supports ML using a method such as ADD (Attribute-Driven Design)
ARCHITECTING THE SYSTEM
We will now look at the steps of each workflow in more detail and discuss the concerns and the decisions that can be made to satisfy them.

[Figure: the two workflows and their steps (arrows denote activity and data flow).
Model development: training data ingestion → data cleansing and normalization → feature engineering → model selection and training → model persistence.
Model serving: new data ingestion → data validation and feature extraction → model transfer and prediction → serving results.]
TRAINING DATA INGESTION

Responsibility
● Collect and store raw data for training
Architectural concerns
● Collect and store large volumes of training data, support fast bulk reading
○ Ingestion: Manual, Message broker, ETL Jobs
○ Storage: Object Storage, SQL or NoSQL, HDFS
● Labeling of raw training data
○ Data labeling toolkit: Intel’s CVAT, Amazon SageMaker Ground Truth
● Protect access to sensitive data
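As an illustration of the ingestion and storage concerns above, here is a minimal sketch that uploads raw files to S3-compatible object storage for later bulk reads; the bucket name, folder layout, and use of boto3 are assumptions for illustration, not part of the original material.

```python
import pathlib
from datetime import date

import boto3

s3 = boto3.client("s3")
BUCKET = "raw-training-data"  # hypothetical bucket name

for path in pathlib.Path("/data/incoming").glob("*.csv"):
    # Group objects by ingestion date so later bulk reads can scan date ranges.
    key = f"raw/{date.today().isoformat()}/{path.name}"
    s3.upload_file(str(path), BUCKET, key)
    print(f"uploaded {path} -> s3://{BUCKET}/{key}")
```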
DATA CLEANSING AND NORMALIZATION

Responsibility
● Identify and remove errors and duplicates from
selected data and perform data conversions
(such as normalization) to create a reliable data set.
Architectural concerns
● Provide mechanisms such as APIs to support query and visualization of the data
○ Data warehouse to support data analysis, such as Hive
● Transform large volumes of raw training data
○ Data processing framework, such as Spark
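A minimal cleansing and normalization sketch using Spark, the data processing framework mentioned above; the column names, storage paths, and the min-max normalization choice are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("cleansing").getOrCreate()

# Hypothetical raw-data location and column names.
df = spark.read.parquet("s3a://raw-training-data/raw/")

clean = (
    df.dropDuplicates(["sensor_id", "timestamp"])   # remove duplicate readings
      .na.drop(subset=["value"])                    # drop rows with missing values
)

# Min-max normalization of the "value" column to [0, 1].
stats = clean.agg(F.min("value").alias("lo"), F.max("value").alias("hi")).first()
clean = clean.withColumn(
    "value_norm", (F.col("value") - stats["lo"]) / (stats["hi"] - stats["lo"])
)

clean.write.mode("overwrite").parquet("s3a://curated-training-data/clean/")
```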
FEATURE ENGINEERING

Responsibility
● Perform data transformations and augmentation to
incorporate additional knowledge into the training data
● Identify the list of features to use for training
Architectural concerns
● Transform large volumes of raw training data into features
● Provide a mechanism for data segregation (training / testing)
● Feature logging and versioning
○ Data versioning mechanism, such as Data Science Version Control System (DVC)
○ Use of a feature store, a data management platform that stores feature data, feature
engineering logic, metadata, and a registry for publishing and discovering features
and training datasets
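A small feature engineering sketch illustrating data segregation into training and test sets; pandas/scikit-learn, the rolling-window features, and all column names are assumptions used only for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical curated data set with "timestamp", "value_norm" and "label" columns.
df = pd.read_parquet("clean.parquet")

# Rolling-window features that incorporate recent history into each observation.
df = df.sort_values("timestamp")
df["value_mean_5"] = df["value_norm"].rolling(window=5, min_periods=1).mean()
df["value_std_5"] = df["value_norm"].rolling(window=5, min_periods=1).std().fillna(0.0)

# Data segregation: keep a held-out test set for later model evaluation.
features = ["value_norm", "value_mean_5", "value_std_5"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["label"], test_size=0.2, random_state=42
)
```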
MODEL SELECTION AND TRAINING
Responsibility
● Based on a selected algorithm, train, tune and
evaluate a model.
Architectural concerns
● Selection of a framework
○ TensorFlow, PyTorch, Spark MLlib, scikit-learn, etc.
● Select training location and provide environment and manage resources to train,
tune and evaluate a model
○ Single vs distributed training, Hardware acceleration (GPU/TPU)
○ Resource Management (e.g. Yarn, Kubernetes)
● Log and monitor training performance metrics
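A minimal training and evaluation sketch with scikit-learn, one of the frameworks listed above, using synthetic data as a stand-in for the engineered features; the algorithm and metrics are illustrative, not prescribed here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered feature set.
X, y = make_classification(n_samples=2000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Log training-performance metrics so they can be monitored across runs.
preds = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, preds))
print("macro F1:", f1_score(y_test, preds, average="macro"))
```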
MODEL PERSISTENCE

Responsibility
● Persist the trained and tuned model (or entire
pipeline) to support transfer to the serving
environment
Architectural concerns
● Persistence of the model
○ Examples: Spark MLlib Pipelines, PMML, MLeap, ONNX
● Storage of the model
○ Examples: Database, document storage, object storage, NFS, DVC
● Optimize the model after training (e.g. reduce its size for use on a constrained device)
○ Example: TensorFlow Model Optimization Toolkit
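A sketch of persisting a trained model in ONNX, one of the formats listed above; the use of scikit-learn and the skl2onnx converter is an assumption for illustration.

```python
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a small stand-in model (10 features) to have something to persist.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
model = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)

# Convert to ONNX; the input signature is a float tensor with 10 features.
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 10]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
# The file can then be stored in a database, object storage, NFS, or versioned with DVC.
```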
NEW DATA INGESTION

Responsibility
● Obtain and import unseen data for predictions
Architectural concerns
● Batch prediction: asynchronously generate predictions for multiple input data
observations.
● Online (or real-time) prediction: synchronously generate predictions for individual
data observations (request/response or streaming).
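A minimal online (request/response) prediction sketch; Flask, the endpoint path, and the joblib-persisted model are assumptions, since no specific serving framework is prescribed here.

```python
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # hypothetical persisted model

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body such as {"features": [0.1, 0.3, 0.7]}.
    features = request.get_json()["features"]
    prediction = model.predict([features])[0]
    return jsonify({"prediction": float(prediction)})

if __name__ == "__main__":
    app.run(port=8080)
```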
DATA VALIDATION AND FEATURE EXTRACTION

Responsibility
● Process raw data into features according to
the transformation rules defined during model
development
Architectural concerns
● Ensure data conforms to the characteristics defined during training
○ Usage of a data schema defined during model development
○ Monitoring of changes in input distributions
● Design batch and/or streaming pipelines
○ Real-time data storage (e.g. Cassandra)
○ Data processing framework (e.g. Spark)
● Select and query additional real-time data sources (for feature extraction)
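A sketch of validating incoming records against a schema defined during model development, as in the concern above; the schema fields and value range are hypothetical.

```python
# Expected schema and value range observed during training (hypothetical).
SCHEMA = {
    "sensor_id": str,
    "timestamp": float,
    "value": float,
}
VALUE_RANGE = (0.0, 1000.0)

def validate(record: dict) -> bool:
    """Return True if the record matches the expected schema and value range."""
    for field, expected_type in SCHEMA.items():
        if field not in record or not isinstance(record[field], expected_type):
            return False
    lo, hi = VALUE_RANGE
    return lo <= record["value"] <= hi

assert validate({"sensor_id": "s-17", "timestamp": 1.7e9, "value": 42.0})
assert not validate({"sensor_id": "s-17", "value": -5.0})  # missing field, out of range
```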
MODEL TRANSFER AND PREDICTION

Responsibility
● Transfer (deploy) the model code and perform predictions
Architectural concerns
● Model transfer and validation
○ Transfer: re-writing, Docker, PMML…
● Rolling out / rolling back a new model version
○ Blue-green deployment, canary testing
○ Support for multiple model versions, update and rollback mechanisms, for example
using TensorFlow Serving
● Define prediction location
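A toy sketch of the canary-rollout concern above: a small fraction of prediction traffic is routed to the candidate model version while the rest stays on the stable one; the 5% share and the model interface are assumptions.

```python
import random

def route_prediction(features, stable_model, canary_model, canary_share=0.05):
    """Send roughly canary_share of prediction requests to the candidate model."""
    model = canary_model if random.random() < canary_share else stable_model
    return model.predict([features])[0]

# Usage, assuming two objects with a scikit-learn-style predict():
# result = route_prediction([0.1, 0.3, 0.7], stable_model, canary_model)
```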
PREDICTION LOCATION
Local model: the model predicts/re-trains on the client side
[Figure: the ML model runs on the client machine.]

Remote model: the model predicts/re-trains on the server side
[Figure: the client machine sends the data for prediction to the ML model on the server machine and receives the results.]

Hybrid model: the model predicts on the client and re-trains on both (federated learning)
[Figure: the client machine holds a local ML model and sends model deltas to the server machine; the server machine holds the global ML model and sends model updates back to the clients.]
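A toy illustration of the hybrid (federated learning) option described above: clients send weight deltas and the server averages them into the global model; the weight shapes and update rule are purely illustrative, not a production protocol.

```python
import numpy as np

global_weights = np.zeros(4)  # toy global ML model (a weight vector)

def client_update(weights):
    """Simulate local re-training on a client and return only the weight delta."""
    local_weights = weights + np.random.normal(scale=0.1, size=weights.shape)
    return local_weights - weights

# The server aggregates deltas from several clients (federated averaging)
# and pushes the updated global model back to the clients.
deltas = [client_update(global_weights) for _ in range(5)]
global_weights = global_weights + np.mean(deltas, axis=0)
print("updated global model:", global_weights)
```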
SERVING RESULTS

Responsibility
● Delivery and post-processing of prediction results
to a destination
Architectural Concerns
● Monitor model staleness (age) and performance
● Monitor predictions vs. actual values when possible
● Monitor deviations between the distributions of predicted and observed labels
● Store prediction results
● Aggregate results from multiple models
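A sketch of the label-distribution monitoring concern above: compare predicted vs. observed label distributions with a simple total-variation distance; the alert threshold is an assumption.

```python
from collections import Counter

def label_distribution(labels):
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}

def distribution_deviation(predicted, observed):
    """Total variation distance between two label distributions."""
    p, q = label_distribution(predicted), label_distribution(observed)
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)

deviation = distribution_deviation(["ok"] * 95 + ["anomaly"] * 5,
                                   ["ok"] * 80 + ["anomaly"] * 20)
if deviation > 0.1:  # hypothetical alert threshold
    print(f"label-distribution drift detected: {deviation:.2f}")
```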
CASE STUDIES
CASE STUDY: DISTRIBUTED IOT NETWORK ACROSS OIL & GAS PRODUCTION

NEW DOMAIN UNDERSTANDING
• SoftServe worked with two Fortune 100 companies – a hardware and networking provider, and an energy exploration and production company – to research the oil extraction process
• SoftServe suggested a solution and architecture design to match the client need for a distributed fiber-optic sensing (IoT) program

DOMAIN-SPECIFIC TECHNOLOGY CHALLENGES / LIMITATIONS


• SoftServe suggested 3rd-party sensing hardware and a data protocol to
address industry-specific challenges
• SoftServe designed and deployed a hybrid edge and cloud data
processing model
• We built a real-time BI layer and analytics engine on large-scale data
streams

SOLUTION DESIGN
• SoftServe’s end solution focused on unsupervised anomaly detection to
help the end client identify observations that do not conform to the
expected behavioral patterns
ARCHITECTURAL DRIVERS
• Ingest and process multi-dimensional time series data from sensing equipment
(100-200GB per day)
• Calculate the key metrics and perform short- and long-term predictions in near
real-time (up to 5 mins)
• Continuously re-train the model when the new data comes in
• Initial training dataset consisted of ~300GB
• Support queries against historical data for analytics
ARCHITECTURAL DECISION [MODEL DEV]
1. Training Data Ingestion
• HDFS used as a storage layer
• Directory structure for data versioning
• Custom data conversion from the proprietary data protocol
ARCHITECTURAL DECISION [MODEL DEV]
2. Data Cleansing and Normalization
• Spark SQL and DataFrames for analytics
• Batch Spark jobs for data pre-processing
ARCHITECTURAL DECISION [MODEL DEV]
3. Feature Engineering
• Batch Spark job to calculate the features
• Selected features were stored in CrateDB and exposed via SQL


ARCHITECTURAL DECISION [MODEL DEV]
4. Model Training and Selection
• Spark ML for model training and tuning
• Yarn resource management
• No hardware acceleration was used


ARCHITECTURAL DECISION [MODEL DEV]
5. Model Persistence
• The resulting models were stored on HDFS
ARCHITECTURAL DECISION [MODEL SERVING]
1. New Data Ingestion
• Kafka used as a message broker to ingest the data from the sensors
ARCHITECTURAL DECISION [MODEL SERVING]
2. Data validation and Feature extraction
• Same batch transformations re-used in Spark Streaming
ARCHITECTURAL DECISION [MODEL SERVING]
3. Model Prediction
• Batch Spark ML jobs scheduled every 3 mins
ARCHITECTURAL DECISION [MODEL SERVING]
4. Serving Results
• The results were saved back to CrateDB and exposed via Impala
• Zoomdata was used to communicate the data and predictions


OUTCOMES
• Highly scalable distributed IoT platform
leveraging state-of-the-art Big Data and Cloud
technologies
• Real-time monitoring and user-centric BI
analytics
• Custom domain-specific self-learning anomaly
detection solution
SMART PARKING
SOLUTION
An innovative SoftServe solution provides automatic parking space detection based on a computer vision ML model.
A CCTV camera installed on a rooftop captures images, and the current parking state is visualized in real time via a web application and an LCD at the parking entrance.
The solution can be used for both open and
authorized parking areas.
ARCHITECTURAL DRIVERS

• Deploy to the private on-premise infrastructure


• Perform real-time predictions over a video stream from the 4K IP camera
• Process 5 images per second for 121 parking spots
• Support on-demand re-training and re-deployment
• Initial training dataset consisted of 200,000+ images (SoftServe’s proprietary)
ARCHITECTURAL DECISION [MODEL DEV]
1. Training Data Ingestion
• NFS used as a storage layer for
training data
• Custom image labeling tool for
training data augmentation
ARCHITECTURAL DECISION [MODEL DEV]
2. Data Cleansing and
Normalization
• Custom image processing pipeline
written in Python (split image, lens
correction, color correction, contrast
and brightness correction etc.)
ARCHITECTURAL DECISION [MODEL DEV]
3. Feature Engineering
• Raw image data used for predictions
ARCHITECTURAL DECISION [MODEL DEV]
4. Model Training and
Selection
• TensorFlow/Python for model
training
• Containerized training jobs ran on a
VM and were orchestrated by Ansible
ARCHITECTURAL DECISION [MODEL DEV]
5. Model Persistence
• The resulting models were stored in a
private Git repository (MS TFS)
• Ansible was used to deploy the model as a
dockerized microservice
ARCHITECTURAL DECISION [MODEL SERVING]
1. New Data Ingestion
• A polling job transfers new images
from the edge device
ARCHITECTURAL DECISION [MODEL SERVING]
2. Data Validation and Feature
Extraction
• Same Python transformations
re-used in a Docker-based worker
service
ARCHITECTURAL DECISION [MODEL SERVING]
3. Model Prediction
• Dockerized RESTful microservice
deployed to a VM
ARCHITECTURAL DECISION [MODEL SERVING]
4. Serving Results
• The results were sent to RabbitMQ to
serve multiple components
PRACTICAL RECOMMENDATIONS

● Cloud-native architectures based on containers, microservices, and Kubernetes help
address ML-specific requirements for composability, portability, and scalability
● The open-source ML community has been working on multiple frameworks and libraries
that provide out-of-the-box ML lifecycle management capabilities (e.g. TFX, Kubeflow,
MLflow, and Seldon)
● Tight cooperation between software engineering and ML teams helps significantly
speed up project development and maximize project success
CONCLUSIONS

● Architecting ML systems poses new challenges compared with the development of
traditional systems because the behaviour is not specified directly in code but is
learned from data.

● We need to architect to support initial model development and the continuous model
refinement lifecycle.

● The model development and model serving workflow steps can be used as a
framework to guide design and to document design decisions.
QUESTIONS?

Humberto Cervantes [email protected]


Iurii Milovanov [email protected]
Rick Kazman [email protected]

Thank you!!