AutoML Tools

The document compares three machine learning platforms: Azure AutoML, GCP AutoML, and AWS Sagemaker. Azure AutoML works for structured data only and allows users to choose between classification, regression, or forecasting tasks. It supports various algorithms and metrics. Models can be deployed and accessed via API. Feature importance can be viewed. GCP AutoML works for vision, language, and structured data tasks. It supports various use cases and has manual/automated labeling. Models can be deployed and accessed via API. Feature importance and metrics can be viewed. AWS Sagemaker supports regression, classification, and time series forecasting. It implements various algorithms. Models can be deployed and accessed via API. Feature importance

Uploaded by

Aayush Chaudhry
Copyright
© © All Rights Reserved

AZURE AUTOML

• Accepts only CSV or TSV input; works for structured data only
• Choose the type of prediction task: classification, regression, or forecasting
• Choose the metric to optimize: accuracy, AUC, norm_macro_recall, avg_precision_score, or precision_score
• Set the training time and the number of iterations
• Choose the validation type: k-fold cross-validation or Monte Carlo cross-validation
• Choose which algorithms to block; the algorithms listed include logistic regression, SGD, naive Bayes, SVM, KNN, decision trees, random forest, and gradient boosting
• The model can be deployed; deployment sets up an HTTPS endpoint that can be used via API calls for inference
• Feature importance can be generated, but a validation dataset (X_valid) must be provided to get it. For documentation, refer to: https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#explain-the-model-interpretability
• Feature importance can be accessed via the command line or the Azure portal
• Reference link for understanding AutoML: https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-automated-ml
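
The two validation types named above differ in how rows are assigned to validation sets. The following is a minimal sketch in plain Python, assuming nothing about how Azure AutoML implements them internally; the function names k_fold_splits and monte_carlo_splits and all parameter names are hypothetical and chosen only for illustration.

```python
import random


def k_fold_splits(n_rows, k):
    """Yield (train_idx, valid_idx) pairs; every row is validated exactly once."""
    indices = list(range(n_rows))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, valid
        start += size


def monte_carlo_splits(n_rows, n_repeats, valid_fraction, seed=0):
    """Yield (train_idx, valid_idx) pairs from repeated random subsampling.

    Unlike k-fold, validation sets from different repeats may overlap,
    and some rows may never be used for validation.
    """
    rng = random.Random(seed)
    n_valid = max(1, round(n_rows * valid_fraction))
    for _ in range(n_repeats):
        shuffled = list(range(n_rows))
        rng.shuffle(shuffled)
        yield sorted(shuffled[n_valid:]), sorted(shuffled[:n_valid])


# Illustrative sizes only: 10 rows, 5 folds, 3 random repeats at 20% validation.
k_folds = list(k_fold_splits(10, 5))
mc_folds = list(monte_carlo_splits(10, 3, 0.2))
```

K-fold guarantees full coverage of the data at the cost of a fixed number of splits, while Monte Carlo lets the number of repeats and the validation fraction be chosen independently.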

GCP AUTOML

• Works for
  o Vision, including images and video
  o Language, revealing the structure and meaning of text, and translation
  o Structured data
• The following are all supported use cases
  o Natural language classification
  o Natural language entity extraction
  o Natural language sentiment analysis
  o Tables
  o Translation
  o Video intelligence classification
  o Video object tracking
  o Vision classification
  o Vision edge
  o Vision object detection
• Has a manual/automated labelling service
• Models can be deployed and used via a REST API for inference
• Feature importance can be viewed
• The evaluation metrics can be accessed via either the GCP console or the command line
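
As a rough illustration of inference over a REST API, the sketch below only builds an HTTP request for a deployed Tables model and does not send it. The URL pattern, the payload shape, the default location, and every identifier (my-project, TBL123, ACCESS_TOKEN) are assumptions made for illustration and should be checked against the current GCP documentation before use.

```python
import json
import urllib.request


def build_predict_request(project_id, model_id, row_values, access_token,
                          location="us-central1"):
    """Build (but do not send) a prediction request for a deployed model.

    The endpoint pattern and the JSON body layout below are assumptions,
    not taken from the source document.
    """
    url = (
        "https://automl.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/models/{model_id}:predict"
    )
    body = json.dumps({"payload": {"row": {"values": row_values}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical project, model, feature values, and token placeholders.
request = build_predict_request("my-project", "TBL123", ["5.1", "3.5"], "ACCESS_TOKEN")
```

Passing the request to urllib.request.urlopen would perform the call; that step is left out here because it needs real credentials and a deployed model.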

AWS SAGEMAKER

• The model can be deployed; deployment sets up an HTTPS endpoint that can be used via API calls for inference
• Can be used for
  o Regression
  o Classification
  o Time series forecasting
  o Recommendations
• Feature importance can be accessed via the command line
• The following algorithms are implemented in AWS SageMaker
  o BlazingText Word2Vec - BlazingText implementation of the Word2Vec algorithm for scaling and accelerating the generation of word embeddings from a large number of documents.
  o DeepAR - An algorithm that generates accurate forecasts by learning patterns from many related time series using recurrent neural networks (RNN).
  o Factorization Machines - A model with the ability to estimate all of the interactions between features even with a very small amount of data.
  o Gradient Boosted Trees (XGBoost) - Short for "Extreme Gradient Boosting", XGBoost is an optimized distributed gradient boosting library.
  o Image Classification (ResNet) - A popular neural network for developing image classification systems.
  o IP Insights - An algorithm to detect malicious users or learn the usage patterns of IP addresses.
  o K-Means Clustering - One of the simplest ML algorithms. It is used to find groups within unlabeled data.
  o K-Nearest Neighbor (k-NN) - An index-based algorithm to address classification and regression problems.
  o Latent Dirichlet Allocation (LDA) - A model that is well suited to automatically discovering the main topics present in a set of text files.
  o Linear Learner (Classification) - Linear classification uses an object's characteristics to identify the appropriate group that it belongs to.
  o Linear Learner (Regression) - Linear regression is used to predict the linear relationship between two variables.
  o Neural Topic Modelling (NTM) - A neural-network-based approach for learning topics from text and image datasets.
  o Object2Vec - A neural-embedding algorithm to compute nearest neighbors and to visualize natural clusters.
  o Object Detection - Detects, classifies, and places bounding boxes around multiple objects in an image.
  o Principal Component Analysis (PCA) - Often used in data pre-processing, this algorithm takes a table or matrix of many features and reduces it to a smaller number of representative features.
  o Random Cut Forest - An unsupervised machine learning algorithm for anomaly detection.
  o Semantic Segmentation - Partitions an image to identify places of interest by assigning a label to the individual pixels of the image.
  o Sequence2Sequence - A general-purpose encoder-decoder for text that is often used for machine translation, text summarization, etc.
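
The use cases and the algorithm list above can be tied together with a small lookup. This is only a sketch: the grouping is an editorial reading of the descriptions above rather than an official AWS taxonomy, it covers only a subset of the listed algorithms, and the names ALGORITHMS_BY_USE_CASE and candidate_algorithms are hypothetical.

```python
# Editorial grouping for illustration; not an official AWS classification.
ALGORITHMS_BY_USE_CASE = {
    "regression": [
        "Linear Learner (Regression)",
        "Gradient Boosted Trees (XGBoost)",
        "K-Nearest Neighbor (k-NN)",
    ],
    "classification": [
        "Linear Learner (Classification)",
        "Gradient Boosted Trees (XGBoost)",
        "K-Nearest Neighbor (k-NN)",
    ],
    "time series forecasting": ["DeepAR"],
    "recommendations": ["Factorization Machines"],
}


def candidate_algorithms(use_case):
    """Return a copy of the algorithm names grouped under a use case."""
    key = use_case.strip().lower()
    if key not in ALGORITHMS_BY_USE_CASE:
        known = ", ".join(sorted(ALGORITHMS_BY_USE_CASE))
        raise ValueError(f"Unknown use case {use_case!r}; expected one of: {known}")
    return list(ALGORITHMS_BY_USE_CASE[key])


forecasting_candidates = candidate_algorithms("Time series forecasting")
regression_candidates = candidate_algorithms("regression")
```

The lookup is case-insensitive and returns a fresh list, so callers can modify the result without changing the table.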
