Active Machine Learning with Python: Refine and elevate data quality over quantity with active learning
Active Machine Learning with Python - Margaux Masson-Forsythe
Active Machine Learning with Python
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Tejashwini R
Book Project Manager: Kirti Pisat
Senior Editor: Vandita Grover
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Vijay Kamble
DevRel Marketing Coordinator: Vinishka Kalra
First published: March 2024
Production reference: 1270324
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-83546-494-6
www.packtpub.com
To my beloved wife, Heather Masson-Forsythe, whose unwavering kindness and support are my pillars of strength with every new intense project I undertake each week.
Contributors
About the author
Margaux Masson-Forsythe is a skilled machine learning engineer and advocate for advancements in surgical data science and climate AI. As the director of machine learning at Surgical Data Science Collective, she builds computer vision models to detect surgical tools in videos and track procedural motions. Masson-Forsythe manages a multidisciplinary team and oversees model implementation, data pipelines, infrastructure, and product delivery. With a background in computer science and expertise in machine learning, computer vision, and geospatial analytics, she has worked on projects related to reforestation, deforestation monitoring, and crop yield prediction.
About the reviewer
Mourya Boggarapa is a deep learning software engineer specializing in the end-to-end integration of large language models for custom AI accelerators. He holds a master’s degree in software engineering from Carnegie Mellon University. Prior to his current role, Mourya honed his skills through diverse experiences: developing backend systems for a major bank, building development infrastructure for a tech giant, and some mobile app development. He cultivated a comprehensive understanding of software development across various domains. His primary passion lies in deep learning. Additionally, he maintains a keen interest in human-computer interaction, aiming to bridge the gap between tech and human experience.
Table of Contents
Preface
Part 1: Fundamentals of Active Machine Learning
1
Introducing Active Machine Learning
Understanding active machine learning systems
Definition
Potential range of applications
Key components of active machine learning systems
Exploring query strategies scenarios
Membership query synthesis
Stream-based selective sampling
Pool-based sampling
Comparing active and passive learning
Summary
2
Designing Query Strategy Frameworks
Technical requirements
Exploring uncertainty sampling methods
Understanding query-by-committee approaches
Maximum disagreement
Vote entropy
Average KL divergence
Labeling with EMC sampling
Sampling with EER
Understanding density-weighted sampling methods
Summary
3
Managing the Human in the Loop
Technical requirements
Designing interactive learning systems and workflows
Exploring human-in-the-loop labeling tools
Common labeling platforms
Handling model-label disagreements
Programmatically identifying mismatches
Manual review of conflicts
Effectively managing human-in-the-loop systems
Ensuring annotation quality and dataset balance
Assess annotator skills
Use multiple annotators
Balanced sampling
Summary
Part 2: Active Machine Learning in Practice
4
Applying Active Learning to Computer Vision
Technical requirements
Implementing active ML for an image classification project
Building a CNN for the CIFAR dataset
Applying uncertainty sampling to improve classification performance
Applying active ML to an object detection project
Preparing and training our model
Analyzing the evaluation metrics
Implementing an active ML strategy
Using active ML for a segmentation project
Summary
5
Leveraging Active Learning for Big Data
Technical requirements
Implementing ML models for video analysis
Selecting the most informative frames with Lightly
Using Lightly to select the best frames to label for object detection
SSL with active ML
Summary
Part 3: Applying Active Machine Learning to Real-World Projects
6
Evaluating and Enhancing Efficiency
Technical requirements
Creating efficient active ML pipelines
Monitoring active ML pipelines
Determining when to stop active ML runs
Enhancing production model monitoring with active ML
Challenges in monitoring production models
Active ML to monitor models in production
Early detection for data drift and model decay
Summary
7
Utilizing Tools and Packages for Active ML
Technical requirements
Mastering Python packages for enhanced active ML
scikit-learn
modAL
Getting familiar with the active ML tools
Summary
Index
Other Books You May Enjoy
Preface
Welcome to Active Machine Learning with Python, a comprehensive guide designed to introduce you to the power of active machine learning. This book is written with the conviction that while data is plentiful, its quality and relevance hold the key to building models that are not only efficient but also robust and insightful.
Active machine learning is a method used in machine learning where the algorithm can query an oracle to label new data points with the desired outputs. It stands at the crossroads of optimization and human-computer interaction, enabling machines to learn more effectively with less data. This is particularly valuable in scenarios where data labeling is costly, time-consuming, or requires expert knowledge.
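To make the query loop concrete, here is a minimal sketch in plain Python; it is not code from the book. It assumes a one-dimensional threshold classifier, a hypothetical oracle function standing in for a human annotator, and an evenly spaced pool of unlabeled points. All names and the 0.637 class boundary are invented for illustration.

```python
TRUE_BOUNDARY = 0.637  # hypothetical ground truth, known only to the oracle


def oracle(x):
    """Hypothetical annotator: returns the true label for a queried point."""
    return int(x >= TRUE_BOUNDARY)


def fit_threshold(labeled):
    """Fit a 1-D threshold: midpoint between the largest 0 and the smallest 1."""
    zeros = [x for x, y in labeled if y == 0]
    ones = [x for x, y in labeled if y == 1]
    return (max(zeros) + min(ones)) / 2


def active_learning(pool, seed_set, budget):
    """Repeatedly query the oracle for the point the model is least sure about."""
    pool = list(pool)
    labeled = list(seed_set)
    for _ in range(budget):
        threshold = fit_threshold(labeled)
        # The point closest to the decision boundary is the most uncertain one.
        query = min(pool, key=lambda x: abs(x - threshold))
        pool.remove(query)
        labeled.append((query, oracle(query)))
    return fit_threshold(labeled), labeled


pool = [i / 100 for i in range(1, 100)]   # 99 unlabeled points
seed_set = [(0.0, 0), (1.0, 1)]           # one labeled example per class
threshold, labeled = active_learning(pool, seed_set, budget=8)
print(f"labels requested: {len(labeled) - len(seed_set)}")
print(f"learned threshold: {threshold:.3f}")
```

In this toy setting, eight queries are enough to place the threshold within one grid step of the true boundary, whereas labeling the whole pool would cost 99 queries. Real projects replace the threshold model with a trained classifier and the oracle with human annotators.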
Throughout this book, we leverage Python, a leading programming language in the field of data science and machine learning, known for its simplicity and powerful libraries. Python serves as an excellent medium for exploring the concepts of active machine learning, providing both beginners and experienced practitioners with the tools needed to implement sophisticated models.
Who this book is for
This book is intended for data scientists, machine learning engineers, researchers, and anyone curious about optimizing machine learning workflows. Whether you are new to active machine learning or looking to enhance your current models, this book provides insights into making the most of your data through strategic querying and learning techniques.
What this book covers
Chapter 1, Introducing Active Machine Learning, explores the fundamental principles of active machine learning, a highly effective approach that significantly differs from passive methods. This chapter also offers insights into its distinctive strategies and advantages.
Chapter 2, Designing Query Strategy Frameworks, presents a comprehensive exploration of the most effective and widely utilized query strategy frameworks in active machine learning and covers uncertainty sampling, query-by-committee, expected model change, expected error reduction, and density-weighted methods.
Chapter 3, Managing the Human in the Loop, discusses the best practices and techniques for the design of interactive active machine learning systems, with an emphasis on optimizing human-in-the-loop labeling. Aspects such as labeling interface design, the crafting of effective workflows, strategies for resolving model-label disagreements, the selection of suitable labelers, and their efficient management are covered.
Chapter 4, Applying Active Learning to Computer Vision, covers various techniques for harnessing the power of active machine learning to enhance computer vision model performance in tasks such as image classification, object detection, and semantic segmentation, also addressing the challenges in their application.
Chapter 5, Leveraging Active Learning for Big Data, explores active machine learning techniques for managing big data such as videos. Video analysis models are challenging to develop because videos are large and, depending on the frames-per-second rate, contain many near-duplicate frames. The chapter demonstrates an active machine learning method for selecting the most informative frames for labeling.
Chapter 6, Evaluating and Enhancing Efficiency, details the evaluation of active machine learning systems, encompassing metrics, automation, efficient labeling, testing, monitoring, and stopping criteria. It aims to support accurate evaluations, give insight into system efficiency, and guide informed improvements.
Chapter 7, Utilizing Tools and Packages for Active ML, discusses the Python libraries, frameworks, and tools commonly used for active learning, highlighting their value in implementing various active learning techniques and offering an overview suitable for both beginners and experienced programmers.
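As a taste of the uncertainty sampling covered in Chapter 2, the following is a minimal sketch of three standard uncertainty scores computed from predicted class probabilities. It is an illustration in plain Python, not the book's implementation; the function names and the example predictions are hypothetical.

```python
import math


def least_confidence(probs):
    """1 minus the probability of the most likely class."""
    return 1.0 - max(probs)


def margin_uncertainty(probs):
    """1 minus the gap between the two most likely classes."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)


def entropy(probs):
    """Shannon entropy (in nats) of the predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, score=entropy, k=1):
    """Return the ids of the k samples with the highest uncertainty score."""
    ranked = sorted(predictions, key=lambda i: score(predictions[i]), reverse=True)
    return ranked[:k]


# Hypothetical model outputs for three unlabeled samples (three classes each).
predictions = {
    "a": [0.98, 0.01, 0.01],  # confident prediction
    "b": [0.40, 0.35, 0.25],  # nearly uniform, so highly uncertain
    "c": [0.70, 0.20, 0.10],
}

for score in (least_confidence, margin_uncertainty, entropy):
    print(score.__name__, select_most_uncertain(predictions, score=score))
```

All three scores are oriented so that a higher value means a less certain prediction, which lets the same selection function rank the pool with any of them. On this toy input they agree on which sample to label first; on real data they can disagree, which is one reason the choice of score matters.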
To get the most out of this book
You should be proficient in Python and familiar with Google Colab, and you should have a foundational understanding of machine learning and deep learning principles. You also need to be familiar with machine learning frameworks such as PyTorch.
This book is for individuals who possess a fundamental understanding of machine learning and deep learning and who aim to acquire knowledge about active learning in order to optimize the annotation process of their machine learning datasets. This optimization will enable them to train the most effective models possible.
You will need to create accounts for several tools: Encord, Roboflow, and Lightly. You will also need access to an AWS EC2 instance for Chapter 6, Evaluating and Enhancing Efficiency.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Active-Machine-Learning-with-Python. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: We define x_true and y_true.
A block of code is set as follows:
y_true = np.array(small_dataset['label'])
x_true = np.array(small_dataset['text'])
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: Anomaly detection is another domain where active learning proves to be highly effective.
Tips or important notes
Appear like this.
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Share Your Thoughts
Once you’ve read Active Machine Learning with Python, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Download a free PDF copy of this book
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books