Python Machine Learning Projects: Learn how to build Machine Learning projects from scratch (English Edition)
About this ebook
The book starts by explaining how important Machine Learning is today and the technology required to make it work. The book then helps you get familiar with basic concepts that underlie Machine Learning, including basic Python Programming. It explains different types of Machine Learning algorithms and how they can be applied in various domains like Recommendation Systems, Text Analysis and Mining, Image Processing, and Social Media Analytics. Towards the end, the book briefly introduces you to the most popular metaheuristic algorithms for optimization.
By the end of the book, you will develop the skills to use Machine Learning effectively in various application domains.
Python Machine Learning Projects - Dr. Deepali R Vora
CHAPTER 1
Introduction to ML
Introduction
The term machine learning was coined by Arthur Samuel in 1959. The basic idea behind the term was to ask "Can machines do what we as humans can do?" rather than "Can machines think?"
These questions led to the development of machine learning, in which machines, just like human beings, learn from experience. The aim is to improve performance with experience. This chapter introduces the related terms, such as data science, data mining, artificial intelligence, machine learning and deep learning. The major focus is to familiarize you with the methods and techniques used in machine learning. To explain the core concepts, this chapter explores the working process of a machine learning algorithm. In recent years, machine learning technology has improved dramatically; this is illustrated through its various applications, along with the limitations and challenges faced while developing machine learning algorithms.
Structure
In this chapter, we will discuss the following topics:
Introduction to Machine Learning
Models of Machine Learning
Supervised machine learning model through training
Unsupervised machine learning model
Semi-supervised machine learning model
Reinforcement machine learning model
Working of Machine Learning algorithm
Challenges for Machine Learning Projects
Limitations of Machine Learning
Application areas of Machine Learning
Difference between the terms data science, data mining, machine learning and deep learning
Objectives
On completion of this chapter, you will be able to understand the various key terms and the fundamentals of machine learning. Additionally, you will be able to understand the types, the models and the mechanism of machine learning. You will become familiar with the challenges and limitations observed while applying machine learning algorithms. These will be elaborated through a number of application areas along with the differences between the various related technologies employed.
Introduction to Machine Learning (ML)
Machine learning is defined as the ability of a computer system to learn from its environment, where the data is provided through enabling algorithms. These algorithms gather insights and make data-driven decisions with minimal human intervention. This enables us to make predictions on previously unanalyzed data. So, instead of writing task-specific code, one feeds the data to a generic algorithm, and the algorithm builds the logic based on the given data. The output then improves with experience, without the need for any explicit programming. Thus, machine learning is closely associated with data science: it involves observing and studying data or experiences to identify patterns and to set up a reasoning system based on the findings. Another way of describing machine learning is to say that it is the science of building hardware or software that can achieve tasks by learning from examples.
The examples come as {input, output} pairs. When new inputs are given, a trained machine can predict the output. For example, recommending products to online shoppers based on what they or similar customers have purchased is an application of machine learning. Machine learning also renames some statistical terms: a target is called a label, a variable is called a feature, and a transformation is called feature creation.
Models in Machine Learning
Machine learning models can be categorized into three basic models: supervised, unsupervised and reinforcement learning. A fourth model, semi-supervised learning, combines the first two.
Supervised machine learning model through training
Supervised learning is learning obtained by training a machine or a model on labeled examples. Once trained, the model can make predictions for new data, also known as test data.
This model works on the data provided to the system. The data available is divided into training and testing data. A supervised learning model analyses the given training data and draws inferences from it. Therefore, mapping between the input and output pair and proper labeling of data is crucial in supervised machine learning models.
The major aim of a supervised model is to utilize historical data, understand its behavior and determine future forecasts based on the historical data available and maintained in the database.
For example, to differentiate between plants and flowers, a few labeled pictures of both categories need to be fed. This will enable the machine learning algorithm to differentiate between, identify and learn about them based on their characteristics. Once the algorithm is trained, classifying the remaining images would become a lot easier for the algorithm.
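The train-then-predict flow described above can be sketched in a few lines of plain Python. This is only an illustration: the two numeric features (colorfulness and green ratio), their values, and the choice of a 1-nearest-neighbor classifier are assumptions made for the sketch, not something prescribed by this chapter.

```python
import math

# Hypothetical labeled examples: (colorfulness, green_ratio) -> label.
# The numbers are invented purely for illustration.
labeled_examples = [
    ((0.90, 0.20), "flower"),
    ((0.80, 0.30), "flower"),
    ((0.85, 0.25), "flower"),
    ((0.95, 0.10), "flower"),
    ((0.20, 0.90), "plant"),
    ((0.10, 0.80), "plant"),
    ((0.30, 0.85), "plant"),
    ((0.15, 0.95), "plant"),
]

# Divide the available data into training and testing data.
training_data = labeled_examples[:3] + labeled_examples[4:7]
test_data = [labeled_examples[3], labeled_examples[7]]


def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    _, nearest_label = min(
        training_data, key=lambda example: math.dist(example[0], features)
    )
    return nearest_label


# Evaluate on examples the model has not seen during training.
correct = sum(predict(features) == label for features, label in test_data)
accuracy = correct / len(test_data)
print(f"Test accuracy: {accuracy:.2f}")
```

The labels in the training data are what make this supervised: without them, `predict` would have nothing to return.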
Unsupervised machine learning model: Self-sufficient learning
Such a model does not use any classified or labelled parameters. It learns through observation and finds structures in the data. It focuses on discovering hidden structures, patterns and relationships from unlabeled data, which enhances the functionality of the system by creating a number of clusters for further analysis.
The unsupervised system or its algorithms are not given the right answer.
The algorithm is expected to examine and interpret its data and draw conclusions from what it is shown, through an iterative learning approach. Unsupervised learning can use both generative learning models and a retrieval-based approach.
These algorithms use techniques such as self-organizing maps, nearest-neighbor mapping, k-means clustering and singular value decomposition.
Unsupervised learning, which is often implemented with neural networks, works well for recognizing images, performing operations on transactional data, performing speech-to-text conversion and natural language generation.
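Of the techniques named above, k-means clustering is the easiest to sketch. The following is a minimal plain-Python version under stated assumptions: the points are invented, and the initial centroids are simply the first point of each visible group (real implementations usually pick them randomly or with a smarter seeding scheme).

```python
import math


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then move centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Unlabeled, made-up 2-D points that form two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the grouping emerges only from the distances between the points.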
Semi-supervised machine learning model
This model is a combination of supervised and unsupervised learning. It trains on a small amount of labeled data and a large amount of unlabeled data to improve the learning accuracy.
Here, all the unlabeled data is fed in, and the machine applies the various algorithms, such as classification, regression and prediction. Then, it understands the characteristics and classifies the information from the data provided.
This learning provides an effective solution when the cost associated with labeling is too high.
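One common way to combine a few labels with many unlabeled examples is self-training (also called pseudo-labeling). The sketch below assumes that technique; the chapter does not name a specific one, and the one-dimensional data and the nearest-neighbor rule are invented for illustration.

```python
import math

# A small amount of labeled data and a larger amount of unlabeled data
# (made-up one-dimensional measurements).
labeled = [((1.0,), "low"), ((9.0,), "high")]
unlabeled = [(2.0,), (3.0,), (8.0,), (7.0,), (4.0,), (6.5,)]


def self_train(labeled, unlabeled):
    """Repeatedly pseudo-label the unlabeled point closest to a labeled one."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        best_point, best_label, best_distance = None, None, math.inf
        for point in remaining:
            for features, label in labeled:
                distance = math.dist(point, features)
                if distance < best_distance:
                    best_point, best_label, best_distance = point, label, distance
        # The most confident guess is treated as if it were a real label.
        labeled.append((best_point, best_label))
        remaining.remove(best_point)
    return labeled


result = dict(self_train(labeled, unlabeled))
for features, label in sorted(result.items()):
    print(features, label)
```

Because each pseudo-label is reused as training data, an early mistake can propagate, which is one reason this approach is usually combined with a confidence threshold in practice.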
Reinforcement machine learning model: Hit and Trial
This learning model interacts with the environment on a trial-and-error basis to determine the best outcome. Reinforcement learning comprises three primary components: the agent (the learner or decision maker), the environment (everything the agent interacts with) and actions (what the agent can do). The agent is rewarded or penalized with points based on the actions it takes, and its objective is to maximize the total reward over a given amount of time. Steps that produce favorable outcomes are rewarded, and steps that produce undesired outcomes are penalized until the algorithm learns the optimal process.
Thus, the goal in reinforcement learning is to learn the best policy to obtain maximum rewards.
Reinforcement learning is often used for robotics, gaming and navigation. For example, to understand a game of chess, an ML algorithm will not analyze individual moves; it will study the game as a whole.
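The agent, environment, action and reward loop can be illustrated with tabular Q-learning, one widely used reinforcement learning algorithm (chosen here as an assumption; the chapter does not single one out). The environment is a hypothetical five-cell corridor in which the agent starts at cell 0 and is rewarded for reaching cell 4; all constants are illustrative, not tuned.

```python
import random

N_STATES = 5                            # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)                      # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration


def step(state, action):
    """Environment: move inside the corridor and return (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else -0.01  # small step penalty
    return next_state, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)                        # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate toward reward + discounted future.
            q[(state, action)] += ALPHA * (
                reward + GAMMA * best_next - q[(state, action)]
            )
            state = next_state
    return q


q = train()
# The learned policy: the preferred action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The policy is read off the learned values: in every cell the agent ends up preferring the action with the highest expected long-term reward.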
Types of Machine Learning Algorithms
Based on the various models developed, machine learning is sub-categorized into three types: supervised, unsupervised and reinforcement-based algorithms, as represented in Figure 1.1:
Figure 1.1: Types of Machine Learning
As shown in Figure 1.1, supervised and unsupervised algorithms work on the continuous and categorical data provided. Supervised algorithms include regression, decision trees and random forests for continuous data, and classification-based methods for categorical data. Unsupervised algorithms include clustering, association analysis and the Hidden Markov model.
Working of Machine Learning algorithm
A machine learning algorithm is trained using a training data set. This training leads to the creation of a model. When new input data is introduced to this model, the ML algorithm makes a prediction.
The following figure depicts the working of an ML algorithm:
Figure 1.2: Working of ML algorithm
If the accuracy of the prediction is acceptable, the machine learning algorithm is deployed. If it is not, the algorithm is repeatedly trained with an augmented training data set until its accuracy becomes acceptable, as represented in Figure 1.2.
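The train, evaluate and retrain cycle can be sketched as a loop. Everything concrete here is an assumption made for illustration: the toy "model" is a single threshold placed halfway between the two class means, the data is invented, and `REQUIRED_ACCURACY` is an arbitrary acceptance level.

```python
def fit(training_data):
    """'Train' a toy model: a threshold halfway between the two class means."""
    zeros = [x for x, label in training_data if label == 0]
    ones = [x for x, label in training_data if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def accuracy(threshold, test_data):
    """Fraction of test examples on the correct side of the threshold."""
    correct = sum((x > threshold) == bool(label) for x, label in test_data)
    return correct / len(test_data)


# Made-up data: the true rule is "label is 1 when x >= 5".
batches = [
    [(0, 0), (1, 0), (5, 1)],   # initial training set
    [(2, 0), (6, 1)],           # first augmentation
    [(4, 0), (9, 1)],           # second augmentation
]
test_data = [(x, int(x >= 5)) for x in range(10)]
REQUIRED_ACCURACY = 0.95        # hypothetical acceptance level

training_data, history = [], []
model = None
for batch in batches:
    training_data.extend(batch)          # augment the training data set
    model = fit(training_data)           # (re)train
    score = accuracy(model, test_data)   # evaluate
    history.append(score)
    if score >= REQUIRED_ACCURACY:
        break                            # acceptable: the model would be deployed

print(history)
```

Each pass through the loop corresponds to one trip around the cycle in Figure 1.2: more data, retraining, and a fresh accuracy check.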
Challenges for Machine Learning Projects
Employing a machine learning method can be extremely tedious, but it can serve as a revenue driver for a company. While machine learning is still evolving, the challenges faced by organizations include the following:
Understanding the limits of contemporary machine learning technology
Many companies expect the algorithms to learn quickly and deliver precise predictions to complex queries. Businesses that build machine learning solutions therefore face the challenge of educating customers about the possible applications of their innovative technology. A business working on a practical machine learning application needs to invest time and resources and take substantial risks. Also, machine learning engineers and data scientists cannot guarantee that the training process of a model can be replicated. Therefore, frequent tests should be run to arrive at the best possible outcome.
The black box problem
In the early days, ML models were simple, shallow methods, so it was easy to understand how the algorithms worked. However, as ML algorithms began working on large sets of data, they moved from simple neural networks to deep learning algorithms. These networks give good accuracy, but the logic by which the models arrive at their results is not known. This is called the black box problem.
Data collection
An immense amount of data is available to train a machine learning model. Data engineering plays a significant role in collecting and streamlining the data. Though data storage has become cheap, it takes time to collect enough data, and buying ready-made data sets is expensive. Additionally, preparing data for algorithm training is a complicated process. To do that, one needs:
To know the problem that the algorithm needs to solve
To establish data collection mechanisms and train the algorithm
To reduce the data with attribute sampling, record sampling or aggregation
To decompose the data and rescale it, a complex and expensive task
Data privacy is yet another challenge. Differentiating between sensitive and non-sensitive data is essential to implement machine learning correctly and efficiently.
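Two of the preparation steps listed above, record sampling and rescaling, can be sketched with the standard library alone. The helper names `sample_records` and `min_max_rescale` are hypothetical, and min-max scaling is just one common way to rescale; the chapter does not prescribe a particular method.

```python
import random


def sample_records(records, fraction, seed=0):
    """Record sampling: keep a random fraction of the rows."""
    rng = random.Random(seed)
    keep = max(1, round(len(records) * fraction))
    return rng.sample(records, keep)


def min_max_rescale(values):
    """Rescale a numeric column to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]   # constant column: nothing to spread out
    return [(value - low) / (high - low) for value in values]


records = list(range(100))             # stand-in for 100 data rows
subset = sample_records(records, fraction=0.1)
print(len(subset))

incomes = [10, 20, 30]                 # made-up column values
print(min_max_rescale(incomes))
```

Sampling reduces the volume of data to process, while rescaling puts features measured in different units on a comparable footing.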
Feature Selection
In a dataset, you may have 1000 features or more, but only selected features help in the prediction; selecting those features is the challenging task. Here, domain knowledge and data scientist expertise come into the picture.
It is a challenge to select the smallest possible subset of input variables (features) that will give optimal or the best predictions.
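A simple way to shortlist features is a filter that ranks each one by the strength of its correlation with the target. The sketch below assumes that approach (Pearson correlation, top k); the feature names and values are invented, and a filter like this scores features one at a time, so it does not replace domain knowledge or guarantee the optimal subset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0                     # a constant column carries no signal
    return covariance / (spread_x * spread_y)


def select_features(columns, target, k):
    """Keep the k features most strongly correlated with the target."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical feature columns and target.
columns = {
    "size": [1, 2, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "age": [5, 4, 3, 2, 1],
}
target = [2, 4, 6, 8, 10]

selected = select_features(columns, target, k=2)
print(selected)
```

The absolute value matters: a feature that falls as the target rises (such as `age` here) is just as informative as one that rises with it.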
Performance prediction
Once optimized predictions are made, the challenge is to predict their generalized performance on new, unseen data.
Performance is measured with standard metrics. For classification problems, these include the confusion matrix (also called a classification matrix), accuracy, recall, precision and the ROC-AUC score; for regression problems, they include mean squared error and adjusted R-squared.
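Several of these metrics follow directly from counting correct and incorrect predictions. The sketch below computes them by hand for binary labels (1 = positive, 0 = negative); the label lists are made up, and in practice a library would normally provide these functions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for regression problems."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up classification results.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(actual, predicted)
print(confusion_counts(actual, predicted))
print(metrics)

# Made-up regression results.
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])
print(mse)
```

Precision asks how many predicted positives were right, while recall asks how many actual positives were found, which is why the two are usually reported together.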
To overcome the various challenges faced, the focus should be on understanding the concepts behind data collection, feature selection and extraction, applying the right algorithm and predicting the performance and accuracy of the applied algorithms.
Limitations of machine learning
The applications of machine learning are observed in various domains, from the financial and retail sectors to agriculture, banking and many more. However, machine learning algorithms have their own limitations, mentioned as follows:
Each narrow application needs to be specially trained
Requires large amounts of hand-crafted, structured training data
Learning must generally be supervised: training data must be tagged
Requires lengthy offline/batch training
Does not learn incrementally or interactively in real time
Poor transfer learning ability, reusability of modules and integration
Systems are opaque, making them very hard to debug
Performance cannot be audited or guaranteed
They encode correlation, not causation or ontological relationships
Do not encode entities, or spatial relationships between entities
Only handle very narrow aspects of natural language
To obtain the best results, various techniques need to be properly employed while considering all limitations.
Application areas of ML
Various industries have embraced machine learning technologies to gain competitive advantage. Domains that utilize machine learning include the following:
Financial services
Appropriate machine learning algorithms are applied to make predictions that serve two important purposes: gaining insights from data to identify investment opportunities, and helping investors know when to trade. Additionally, various data mining and machine learning algorithms are used to perform cyber surveillance to prevent fraud and to identify clients with high-risk profiles in the banking sector and other businesses in the financial industry.
Government
For a government organization, there is a huge collection of data obtained from multiple sources, and records are maintained over years. Such data forms the prime source from where data can be mined, and insights generated to improve public safety and provide utilities for the welfare of the society. Machine learning can play an important role in determining such services. It can also help detect fraud and minimize identity theft.
Health care
Due to the advent of wearable devices, sensor technology incorporated in the healthcare sector can provide access to a patient’s health in real time. This data accumulation can help medical experts analyze the data, and identify trends and critical areas, which can lead to improved diagnosis and treatment. Machine learning models can be developed and trained to provide insights into the patient’s health.
Retail
Online shopping is the trend these days. Business organizations are providing services like recommending products and their combinations that have been previously purchased based on the buying history. Retailers rely on machine learning to capture data, analyze it and use it to personalize a shopping experience and provide customer insights along with price optimization.
Oil and gas
Machine learning models are being trained to determine new energy sources, analyze minerals in the ground, predict refinery sensor failure, and streamline oil distribution to make the refineries more efficient and cost-effective.
Transportation
The transportation industry relies heavily on making routes more efficient and predicting potential problems. Machine learning algorithms therefore have to analyze and model the incoming data, which can then be used to identify patterns and trends and to generate optimized routes. Thus, machine learning is an important tool for various transportation organizations.
Other applications include the following:
Facebook’s News Feed
The News Feed uses machine learning to personalize each member’s feed. If a member frequently stops scrolling to read a particular friend’s posts, the News Feed will start showing more of that friend’s activity earlier in the feed. This is done by performing statistical analysis and predictive analytics to identify patterns in the user’s data and using those patterns to populate the News Feed.
Enterprise application
In the Customer Relationship Management (CRM) system, machine learning utilizes learning models to analyze the emails and informs the sales team members about the most important mails to respond to first. Similarly, Human resource (HR) systems use machine learning models to identify the characteristics of effective employees and use this to find the best applicants for open positions.
Machine learning software helps users automatically identify important data points for providing business insights and intelligence.
Self-driving cars
Machine learning uses deep learning neural networks to identify objects and determine optimal actions for safely steering a vehicle.
Virtual assistant
In order to interpret natural speech, personal schedules or previously defined preferences and take action, smart assistants utilize machine learning technology through several deep learning models.
Difference between the terms data science, data mining, machine learning and deep learning
Although each of these methods has the same goal, i.e., to extract insights, patterns and relationships that can be used to make decisions, each one has a different approach and ability to determine them, as indicated by Table 1.1.
Data Science and Data Mining
Data science deals with the entire scope of collecting and processing data, while data mining involves analyzing large amounts of data to discover patterns and other useful information.
Data mining uses traditional statistical methods and related techniques, such as statistical algorithms, machine learning, text analytics and time series analysis, to identify previously unknown patterns in data. It also includes the study and practice of data storage and data manipulation. Refer to Table 1.1:
Table 1.1: Difference between data science and data mining
Machine learning and deep learning
Machine learning is developed based on its ability to probe data for its structures. It uses an iterative approach to learn from data, so the learning is easily automated.
Deep learning combines advances in computing power and neural networks to learn complicated patterns in large amounts of data for identifying objects in images and words in sounds, as indicated by Table 1.2:
Table 1.2: Difference between machine learning and deep learning
Conclusion
This chapter introduced you to the key terms used in data science, data mining, artificial intelligence, machine learning and deep learning. Various types of machine learning algorithms, the methods applied and models deployed for learning were also explored here. The working of the machine learning algorithms and the challenges faced while developing the machine learning algorithm were then specified. This was followed by clearing the concepts of machine learning with the help of several examples. We also looked at the difference between data science, data mining, machine learning and deep learning technologies.
In the next chapter, the very popular Python language will be introduced. Then, we will explore code editors, IDEs and the working of Python, along with its basic structure, control statements, and exception handling in detail.
Questions and Answers
1. Define the terms machine learning, data science and artificial intelligence with examples.
Ans. Machine learning (ML) algorithms enable computers to learn from data and even improve themselves without being explicitly programmed. You may already be using a device that utilizes ML, for example, a wearable fitness tracker like Fitbit or an intelligent home assistant like Google Home.
Data science: Data science is the discipline that studies trends in data to make the best possible use of that data. Machine learning algorithms, deep learning algorithms and Python libraries make the data science domain more powerful. It includes the following:
Statistics (traditional analysis you’re used to thinking about)
Visualization (Graphs and tools)
Data science is used in domains like fraud detection, share market prediction, disease prediction, chatbot, social media management, user profiling and speech recognition.
Artificial intelligence: It is a technique that makes machines capable of performing tasks that normally require human intelligence. These machines are not only trained to do tasks but also improve over time. For example, carrying a tray of drinks through a crowded bar and serving them to the correct customer is something servers do every day, but it is a complex exercise in decision-making, based on a high volume of data being transmitted between neurons in the human brain.
In broader terms, machine learning and deep learning are subsets of artificial intelligence, and data science provides the tools to get the data ready for implementation, whether through cleaning, exploratory data analysis, imputation or splitting.
2. Explain the various models used in machine learning.
Ans. Let us talk about the various models in machine learning.
Linear regression: It