AWS Certified Machine Learning - Specialty (MLS-C01) Certification Guide: The ultimate guide to passing the MLS-C01 exam on your first attempt
By Somanath Nanda and Weslley Moura
AWS Certified Machine Learning - Specialty (MLS-C01) Certification Guide
Second Edition
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Authors: Somanath Nanda and Weslley Moura
Reviewer: Patrick Uzuwe
Publishing Product Manager: Sneha Shinde
Senior-Development Editor: Ketan Giri
Development Editor: Kalyani S.
Presentation Designer: Shantanu Zagade
Editorial Board: Vijin Boricha, Megan Carlisle, Wilson D'souza, Ketan Giri, Saurabh Kadave, Alex Mazonowicz, Abhishek Rane, Gandhali Raut, and Ankita Thakur
First Published: March 2021
Second Edition: February 2024
Production Reference: 1280224
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB
ISBN: 978-1-83508-220-1
www.packtpub.com
Contributors
About the Authors
Somanath Nanda has 14 years of experience designing and building Data and ML products. He emphasizes implementing fault-tolerant system design practices throughout the software development lifecycle. He currently holds a prominent leadership position in the finance domain, actively shaping strategic decisions and executions and expertly guiding engineering teams to achieve success.
Weslley Moura has been developing data products for the past decade. In his recent roles, he has been influencing data strategy and leading data teams in the urban logistics and blockchain industries.
About the Reviewer
Patrick Uzuwe serves as the Chief Technology Officer (CTO) at Sparkrena, a company based in Sheffield, England, United Kingdom. In this role, he specializes in assisting customers in the design and development of cloud-native machine learning products. Driven by a passion for solving challenging problems, he collaborates with partners and customers to modernize their machine learning stack, integrating seamlessly with Amazon SageMaker. Dr. Uzuwe actively works alongside both business and engineering teams to ensure the success of products.
His academic background includes a Ph.D. in Information Systems, which he earned from The University of Bolton in Manchester, United Kingdom.
Table of Contents
Preface
1
Machine Learning Fundamentals
Making The Most Out of This Book – Your Certification and Beyond
Comparing AI, ML, and DL
Examining ML
Examining DL
Classifying supervised, unsupervised, and reinforcement learning
Introducing supervised learning
The CRISP-DM modeling life cycle
Data splitting
Overfitting and underfitting
Applying cross-validation and measuring overfitting
Bootstrapping methods
The variance versus bias trade-off
Shuffling your training set
Modeling expectations
Introducing ML frameworks
ML in the cloud
Summary
Exam Readiness Drill – Chapter Review Questions
2
AWS Services for Data Storage
Technical requirements
Storing Data on Amazon S3
Creating buckets to hold data
Distinguishing between object tags and object metadata
Controlling access to buckets and objects on Amazon S3
S3 bucket policy
Protecting data on Amazon S3
Applying bucket versioning
Applying encryption to buckets
Securing S3 objects at rest and in transit
Using other types of data stores
Relational Database Service (RDS)
Managing failover in Amazon RDS
Taking automatic backups, RDS snapshots, and restore and read replicas
Writing to Amazon Aurora with multi-master capabilities
Storing columnar data on Amazon Redshift
Amazon DynamoDB for NoSQL Database-as-a-Service
Summary
Exam Readiness Drill – Chapter Review Questions
3
AWS Services for Data Migration and Processing
Technical requirements
Creating ETL jobs on AWS Glue
Features of AWS Glue
Getting hands-on with AWS Glue Data Catalog components
Getting hands-on with AWS Glue ETL components
Querying S3 data using Athena
Processing real-time data using Kinesis Data Streams
Storing and transforming real-time data using Kinesis Data Firehose
Different ways of ingesting data from on-premises into AWS
AWS Storage Gateway
Snowball, Snowball Edge, and Snowmobile
AWS DataSync
AWS Database Migration Service
Processing stored data on AWS
AWS EMR
AWS Batch
Summary
Exam Readiness Drill – Chapter Review Questions
4
Data Preparation and Transformation
Identifying types of features
Dealing with categorical features
Transforming nominal features
Applying binary encoding
Transforming ordinal features
Avoiding confusion in our train and test datasets
Dealing with numerical features
Data normalization
Data standardization
Applying binning and discretization
Applying other types of numerical transformations
Understanding data distributions
Handling missing values
Dealing with outliers
Dealing with unbalanced datasets
Dealing with text data
Bag of words
TF-IDF
Word embedding
Summary
Exam Readiness Drill – Chapter Review Questions
5
Data Understanding and Visualization
Visualizing relationships in your data
Visualizing comparisons in your data
Visualizing distributions in your data
Visualizing compositions in your data
Building key performance indicators
Introducing QuickSight
Summary
Exam Readiness Drill – Chapter Review Questions
6
Applying Machine Learning Algorithms
Introducing this chapter
Storing the training data
A word about ensemble models
Supervised learning
Working with regression models
Working with classification models
Forecasting models
Object2Vec
Unsupervised learning
Clustering
Anomaly detection
Dimensionality reduction
IP Insights
Textual analysis
BlazingText algorithm
Sequence-to-sequence algorithm
Neural Topic Model algorithm
Image processing
Image classification algorithm
Semantic segmentation algorithm
Object detection algorithm
Summary
Exam Readiness Drill – Chapter Review Questions
7
Evaluating and Optimizing Models
Introducing model evaluation
Evaluating classification models
Extracting metrics from a confusion matrix
Summarizing precision and recall
Evaluating regression models
Exploring other regression metrics
Model optimization
Grid search
Summary
Exam Readiness Drill – Chapter Review Questions
8
AWS Application Services for AI/ML
Technical requirements
Analyzing images and videos with Amazon Rekognition
Exploring the benefits of Amazon Rekognition
Getting hands-on with Amazon Rekognition
Text to speech with Amazon Polly
Exploring the benefits of Amazon Polly
Getting hands-on with Amazon Polly
Speech to text with Amazon Transcribe
Exploring the benefits of Amazon Transcribe
Getting hands-on with Amazon Transcribe
Implementing natural language processing with Amazon Comprehend
Exploring the benefits of Amazon Comprehend
Getting hands-on with Amazon Comprehend
Translating documents with Amazon Translate
Exploring the benefits of Amazon Translate
Getting hands-on with Amazon Translate
Extracting text from documents with Amazon Textract
Exploring the benefits of Amazon Textract
Getting hands-on with Amazon Textract
Creating chatbots on Amazon Lex
Exploring the benefits of Amazon Lex
Getting hands-on with Amazon Lex
Amazon Forecast
Exploring the benefits of Amazon Forecast
Sales Forecasting Model with Amazon Forecast
Summary
Exam Readiness Drill – Chapter Review Questions
9
Amazon SageMaker Modeling
Technical requirements
Creating notebooks in Amazon SageMaker
What is Amazon SageMaker?
Training Data Location and Formats
Getting hands-on with Amazon SageMaker notebook instances
Getting hands-on with Amazon SageMaker’s training and inference instances
Model tuning
Tracking your training jobs and selecting the best model
Choosing instance types in Amazon SageMaker
Choosing the right instance type for a training job
Choosing the right instance type for an inference job
Taking care of Scalability Configurations
Scaling Policy Overview
Scale Based on a Schedule
Minimum and Maximum Scaling Limits
Cooldown Period
Securing SageMaker notebooks
SageMaker Debugger
SageMaker Autopilot
SageMaker Model Monitor
SageMaker Training Compiler
SageMaker Data Wrangler
SageMaker Feature Store
SageMaker Edge Manager
SageMaker Canvas
Summary
Exam Readiness Drill – Chapter Review Questions
10
Model Deployment
Factors influencing model deployment options
SageMaker deployment options
Real-time endpoint deployment
Batch transform job
Multi-model endpoint deployment
Endpoint autoscaling
Serverless APIs with AWS Lambda and SageMaker
Creating alternative pipelines with Lambda Functions
Creating and configuring a Lambda Function
Completing your configurations and deploying a Lambda function
Working with step functions
Scaling applications with SageMaker deployment and AWS Autoscaling
Scenario 1 – Fluctuating inference workloads
Scenario 2 – The batch processing of large datasets
Scenario 3 – A multi-model endpoint with dynamic traffic
Scenario 4 – Continuous Model Monitoring with drift detection
Securing SageMaker applications
Summary
Exam Readiness Drill – Chapter Review Questions
11
Accessing the Online Practice Resources
Other Books You May Enjoy
Preface
The AWS Machine Learning Specialty certification exam tests your competency to perform machine learning (ML) on AWS infrastructure. This book covers the entire exam syllabus in depth using practical examples to help you with your real-world ML projects on AWS.
Starting with an introduction to ML on AWS, you will learn the fundamentals of ML and explore important AWS services for artificial intelligence (AI). You will then see how to store and process data for ML using several AWS services, such as S3 and EMR.
You will also learn how to prepare data for ML and discover different techniques for data manipulation and transformation for different types of variables. The book covers the handling of missing data and outliers and takes you through various ML tasks, such as classification, regression, clustering, forecasting, anomaly detection, text mining, and image processing, along with their specific ML algorithms, that you need to know in order to pass the exam. Finally, you will explore model evaluation, optimization, and deployment and get to grips with deploying models in a production environment and monitoring them.
By the end of the book, you will have gained knowledge of all the key fields of ML and the solutions that AWS has released for each of them, along with the tools, methods, and techniques commonly used in each domain of AWS ML. This book is not only intended to support you in the AWS Machine Learning Specialty certification exam but also to make your ML professional journey a lot easier.
Who This Book Is for
This book is designed for both students and professionals who are preparing for the AWS Certified Machine Learning Specialty exam or who want to enhance their understanding of machine learning, with a specific emphasis on AWS. Familiarity with machine learning basics and AWS services is recommended to fully benefit from this book.
What This Book Covers
Chapter 1, Machine Learning Fundamentals, covers some ML definitions, different types of modeling approaches, and all the steps necessary to build an ML product.
Chapter 2, AWS Services for Data Storage, teaches you about the AWS services used to store data for ML. You will learn about the many different S3 storage classes and when to use each of them. You will also learn how to handle data encryption and how to secure your data at rest and in transit. Finally, you will learn about other types of data store services that are also worth knowing for the exam.
Chapter 3, AWS Services for Data Migration and Processing, teaches you about the AWS services used to process data for ML. You will learn how to deal with batch and real-time processing, how to directly query data on Amazon S3, and how to create big data applications on EMR.
Chapter 4, Data Preparation and Transformation, deals with categorical and numerical features and applying different techniques to transform your data, such as one-hot encoding, binary encoding, ordinal encoding, binning, and text transformations. You will also learn how to handle missing values and outliers in your data, two important topics for building good ML models.
Chapter 5, Data Understanding and Visualization, teaches you how to select the most appropriate data visualization technique according to different variable types and business needs. You will also learn about AWS services for visualizing data.
Chapter 6, Applying Machine Learning Algorithms, covers different types of ML tasks, such as classification, regression, clustering, forecasting, anomaly detection, text mining, and image processing. Each of these tasks has specific algorithms that you should know about to pass the exam. You will also learn how ensemble models work and how to deal with the curse of dimensionality.
Chapter 7, Evaluating and Optimizing Models, teaches you how to select model metrics to evaluate model results. You will also learn how to optimize your model by tuning its hyperparameters.
Chapter 8, AWS Application Services for AI/ML, covers details of the various AI/ML applications offered by AWS that you need to know about to pass the exam.
Chapter 9, Amazon SageMaker Modeling, teaches you how to spin up notebooks to work with exploratory data analysis and how to train your models on Amazon SageMaker. You will learn where and how your training data should be stored in order to make it accessible through SageMaker and explore the different data formats that you can use.
Chapter 10, Model Deployment, teaches you about several AWS model deployment options. You will review SageMaker deployment options, creating alternative pipelines with Lambda functions, working with Step Functions, configuring auto scaling, and securing SageMaker applications.
How to Use This Book
This AWS Certified Machine Learning Specialty study guide explains each concept from the exam syllabus using realistic examples and comprehensive theoretical notes. The book is your go-to resource for acing the AWS Certified Machine Learning Specialty exam with confidence.
Online Practice Resources
With this book, you will unlock unlimited access to our online exam-prep platform (Figure 0.1). This is your place to practice everything you learn in the book.
How to access the resources
To learn how to access the online resources, refer to Chapter 11, Accessing the Online Practice Resources at the end of this book.
Figure 0.1 – Online exam-prep platform on a desktop device
Sharpen your knowledge of MLS-C01 concepts with multiple sets of mock exams, interactive flashcards, and exam tips accessible from all modern web browsers.
Download the Color Images
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/ky8E8.
Conventions Used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: You will use the detect_labels API from Amazon Rekognition in the code.
A block of code is set as follows:
from sagemaker.predictor import Predictor
predictor = Predictor(endpoint_name='your-endpoint-name', sagemaker_session=sagemaker_session)
predictor.predict('input_data')
Any command-line input or output is written as follows:
sh-4.2$ cd ~/SageMaker/
sh-4.2$ git clone https://github.com/PacktPublishing/AWS-Certified-Machine-Learning-Specialty-MLS-C01-Certification-Guide-Second-Edition.git
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: In CloudWatch, each Lambda function will have a log group and, inside that log group, many log streams.
Tips or important notes
Appear like this.
Get in Touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details. We ensure that all valid errata are promptly updated in the GitHub repository, with the relevant information available in the Readme.md file. You can access the GitHub repository at https://packt.link/QFk6t.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Share Your Thoughts
Once you’ve read AWS Certified Machine Learning - Specialty (MLS-C01) Certification Guide, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Download a Free PDF Copy of This Book
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there. You can also get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below:
https://packt.link/free-ebook/9781835082201
Submit your proof of purchase.
That’s it! We’ll send your free PDF and other benefits directly to your email.
1
Machine Learning Fundamentals
For many decades, researchers have been trying to simulate human brain activity through the field known as artificial intelligence, or AI for short. In 1956, a group of people met at the Dartmouth Summer Research Project on Artificial Intelligence, an event that is widely accepted as the first group discussion about AI as it’s known today. Researchers were trying to prove that many aspects of the learning process could be precisely described and, therefore, automated and replicated by a machine. Today, you know they were right!
Many other terms appeared in this field, such as machine learning (ML) and deep learning (DL). These sub-areas of AI have also been evolving for many decades (granted, nothing here is new to the science). However, with the natural advance of the information society and, more recently, the advent of big data platforms, AI applications have been reborn with much more power and applicability: power because there are now more computational resources to simulate and implement them, and applicability because information is now everywhere.
Even more recently, cloud service providers have put AI in the cloud. This helps all sizes of companies to reduce their operational costs and even lets them sample AI applications, considering that it could be too costly for a small company to maintain its own data center to scale an AI application.
An incredible journey of building cutting-edge AI applications has emerged with the popularization of big data and cloud services. In November 2022, one specific technology gained significant attention and put AI on the list of the most discussed topics across the technology industry: ChatGPT.
ChatGPT is a popular AI application that uses large language models (more specifically, generative pre-trained transformers) trained on massive amounts of text data to understand and generate human-like language. These models are designed to process and comprehend the complexities of human language, including grammar, context, and semantics.
Large language models utilize DL techniques (for example, deep neural networks based on the transformer architecture) to learn patterns and relationships within textual data. They typically consist of billions of parameters, making them highly complex and capable of capturing very specific language structures.
This mix of terms and classes of use cases can make it hard to understand the practical steps of implementing AI applications. That brings you to the goal of this chapter: being able to describe what the terms AI, ML, and DL mean, as well as understanding all the nuances of an ML pipeline. Avoiding confusion about these terms and knowing exactly what an ML pipeline is will allow you to properly select your services, develop your applications, and master the AWS Machine Learning Specialty exam.
Making The Most Out of This Book – Your Certification and Beyond
This book and its accompanying online resources are designed to be a complete preparation tool for your MLS-C01 Exam.
The book is written in a way that you can apply everything you’ve learned here even after your certification. The online practice resources that come with this book (Figure 1.1) are designed to improve your test-taking skills. They are loaded with timed mock exams, interactive flashcards, and exam tips to help you work on your exam readiness from now till your test day.
Before You Proceed
To learn how to access these resources, head over to Chapter 11, Accessing the Online Practice Resources, at the end of the book.
Figure 1.1 – Dashboard interface of the online practice resources
Here are some tips on how to make the most out of this book so that you can clear your certification and retain your knowledge beyond your exam:
Read each section thoroughly.
Make ample notes: You can use your favorite online note-taking tool or use a physical notebook. The free online resources also give you access to an online version of this book. Click the BACK TO THE BOOK link from the Dashboard to access the book in Packt Reader. You can highlight specific sections of the book there.
Chapter Review Questions: At the end of this chapter, you’ll find a link to review questions for this chapter. These are designed to test your knowledge of the chapter. Aim to score at least 75% before moving on to the next chapter. You’ll find detailed instructions on how to make the most of these questions at the end of this chapter in the Exam Readiness Drill - Chapter Review Questions section. That way, you’re improving your exam-taking skills after each chapter, rather than at the end.
Flashcards: After you’ve gone through the book and scored 75% or more in each of the chapter review questions, start reviewing the online flashcards. They will help you memorize key concepts.
Mock Exams: Solve the mock exams that come with the book till your exam day. If you get some answers wrong, go back to the book and revisit the concepts you’re weak in.
Exam Tips: Review these from time to time to improve your exam readiness even further.
The main topics of this chapter are as follows:
Comparing AI, ML, and DL
Classifying supervised, unsupervised, and reinforcement learning
The CRISP-DM modeling life cycle
Data splitting
Modeling expectations
Introducing ML frameworks
ML in the cloud
Comparing AI, ML, and DL
AI is a broad field that studies different ways to create systems and machines that will solve problems by simulating human intelligence. There are different levels of sophistication to create these programs and machines, which go from simple rule-based engines to complex self-learning systems. AI covers, but is not limited to, the following sub-areas:
Robotics
Natural language processing (NLP)
Rule-based systems
Machine learning (ML)
Computer vision
The area this certification exam focuses on is ML.
Examining ML
ML is a sub-area of AI that aims to create systems and machines that can learn from experience, without being explicitly programmed. As the name suggests, the system can observe its underlying environment, learn, and adapt itself without human intervention. Algorithms behind ML systems usually extract and improve knowledge from the data and conditions that are available to them.
Figure 1.2 – Hierarchy of AI, ML, and DL
You should keep in mind that there are different classes of ML algorithms, for example, decision tree-based models, probabilistic-based models, and neural network models. Each of these classes might contain dozens of specific algorithms or architectures (some of them will be covered in later sections of this book).
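To make the contrast between a rule-based engine and a system that learns from data more concrete, here is a minimal sketch. It is not taken from the book: the transaction amounts, the labels, the 1000 cutoff, and the function names are all invented for illustration. It compares a hand-written rule with a one-split decision rule (often called a decision stump, the simplest tree-based model) whose threshold is chosen from labeled examples.

```python
# Illustrative sketch only: the data, names, and the 1000 cutoff are assumptions.

def rule_based(amount):
    """Rule-based engine: a person hard-coded the 1000 threshold."""
    return amount > 1000


def learn_threshold(amounts, labels):
    """Learn a one-split rule (a decision stump) from labeled data.

    Every observed amount is tried as a candidate threshold, and the one
    that classifies the most training examples correctly is kept.
    """
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(amounts)):
        correct = sum((a > candidate) == y for a, y in zip(amounts, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Hypothetical training data: transaction amounts and whether each was fraud.
amounts = [20, 35, 50, 80, 400, 650, 900]
is_fraud = [False, False, False, False, True, True, True]

threshold = learn_threshold(amounts, is_fraud)

# The hand-written rule misses a 400 transaction; the learned rule flags it.
print(rule_based(400), 400 > threshold)
```

With this toy data, the learned threshold comes from the examples rather than from a programmer's guess, which is the sense in which the system was not explicitly programmed. Real tree-based algorithms choose splits with criteria such as Gini impurity or information gain rather than raw accuracy.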
As you might have noticed in Figure 1.2, you can be even more specific and break the ML field down into another very important topic for the Machine Learning Specialty exam: deep learning, or DL for short.
Examining DL
DL is a subset of ML that aims to propose algorithms that connect multiple layers to solve a particular problem. The knowledge is then passed through, layer by layer, until the optimal solution is found. The most common type of DL algorithm is deep neural networks.
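The idea of passing information layer by layer can be sketched as a forward pass through a tiny fully connected network. This is a simplified illustration rather than the book's code: the layer sizes, weights, biases, and function names are hypothetical, and the sketch covers only the forward pass, not the training step that would adjust the weights toward a good solution.

```python
import math


def dense_layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then a sigmoid."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activation = inputs
    for weights, biases in layers:
        activation = dense_layer(activation, weights, biases)
    return activation


# Hypothetical, hand-picked parameters; in practice they are learned from data.
layers = [
    # Hidden layer: 2 inputs -> 3 neurons
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    # Output layer: 3 inputs -> 1 neuron
    ([[0.7, -0.5, 0.2]], [0.05]),
]

prediction = forward([1.0, 2.0], layers)
print(prediction)
```

A deep neural network follows the same pattern with many more layers and neurons, and a training procedure (typically gradient descent with backpropagation) replaces the hand-picked numbers.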
At the time of writing this book, DL is a very hot topic in the field of ML. Most of the current state-of-the-art algorithms for machine translation, image captioning, and computer vision were proposed in the past few years and are a part of the DL field (GPT-4, used by the ChatGPT application, is one of these algorithms).
Now that you have an overview of types of AI, take a look at some of the ways you can classify ML.
Classifying supervised, unsupervised, and reinforcement learning
ML is a very extensive field of study; that’s why it is very important to have a clear definition of its sub-divisions. From a very broad perspective, you can split ML algorithms into two main classes: supervised learning and unsupervised learning.
Introducing supervised learning
Supervised algorithms use a class or label (from the input data) as support to find and validate the optimal solution. In Table 1.1, there is a dataset that aims to classify fraudulent transactions from a financial company.
Table 1.1 – Sample dataset for supervised learning
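The role of the label in both finding and validating a solution can be sketched as follows. The rows below are made up for illustration and are not the contents of Table 1.1; the feature choice (amount and hour of day), the function names, and the use of a 1-nearest-neighbor classifier are assumptions picked to keep the example short.

```python
# Hypothetical labeled transactions: (amount, hour_of_day) -> is_fraud.
train = [
    ((25.0, 14), False),
    ((40.0, 10), False),
    ((980.0, 3), True),
    ((15.0, 18), False),
    ((1200.0, 2), True),
    ((870.0, 4), True),
]

# Labeled examples kept aside so the labels can validate the model.
holdout = [
    ((30.0, 12), False),
    ((1100.0, 1), True),
]


def predict(features, labeled_examples):
    """1-nearest-neighbor: return the label of the closest training example."""

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(
        labeled_examples,
        key=lambda example: squared_distance(example[0], features),
    )
    return label


# Compare predictions with the known labels of the held-out examples.
correct = sum(predict(features, train) == label for features, label in holdout)
accuracy = correct / len(holdout)
print(accuracy)
```

The labels appear twice: once in the training examples that drive the prediction, and once in the held-out examples that measure how often the prediction is right. Features on very different scales, as here, would normally be rescaled before computing distances.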
The first four