EPS DL Handout1 Introduction Compressed
SESSION-1
BY
ASHA
Artificial Intelligence
• Artificial Intelligence refers to the development of computer
systems that can perform tasks that typically require human
intelligence.
Machine Learning
• Machine Learning involves algorithms and statistical models that
enable computers to learn from data.
Computer Vision (CV) is a field of artificial intelligence that enables
computers to interpret and understand the visual world. Using
digital images from cameras and videos and deep learning models,
machines can accurately identify and classify objects, and then
react to what they "see."
Natural language processing
In 1959, Arthur Samuel, a computer scientist who pioneered the study of artificial
intelligence, described machine learning as the “field of study that gives computers the
ability to learn without being explicitly programmed.”
Machine learning is a subset of AI, which enables the machine to automatically learn from
data, improve performance from past experiences, and make predictions.
Machine learning is a subset of artificial intelligence that aims to
mimic how human beings learn by using data.
A more technical definition was given by Tom M. Mitchell (1997): “A computer program
is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves
with experience E.”
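Mitchell's definition can be made concrete with a toy sketch. Everything below is an illustrative assumption, not part of the handout: the task T is deciding whether an integer is at or above a hidden cutoff, the experience E is the number of labelled examples seen, and the performance measure P is accuracy on a fixed set of test points.

```python
# Toy illustration of Mitchell's definition (all names here are hypothetical).
# Task T: decide whether an integer x in 0..99 is >= a hidden cutoff.
# Experience E: the labelled examples seen so far.
# Performance P: accuracy on a fixed set of test points.

HIDDEN_CUTOFF = 50


def true_label(x):
    return x >= HIDDEN_CUTOFF


def learn_cutoff(examples):
    """Estimate the cutoff as the midpoint between the largest negative
    example and the smallest positive example seen so far."""
    negatives = [x for x, label in examples if not label]
    positives = [x for x, label in examples if label]
    low = max(negatives) if negatives else 0
    high = min(positives) if positives else 99
    return (low + high) / 2


def accuracy(cutoff, test_points):
    correct = sum((x >= cutoff) == true_label(x) for x in test_points)
    return correct / len(test_points)


stream = [10, 60, 20, 56, 40, 52, 48]   # order in which examples arrive
test_points = list(range(100))

scores = []
for n in (2, 4, 6, 7):
    examples = [(x, true_label(x)) for x in stream[:n]]
    cutoff = learn_cutoff(examples)
    scores.append(accuracy(cutoff, test_points))
    print(f"E = {n} examples -> learned cutoff {cutoff}, P = {scores[-1]:.2f}")
```

On this hand-picked stream the accuracy rises as more examples are seen, which is what "learning" means in Mitchell's sense. Real data will not always improve this smoothly.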
How does machine learning work?
DATASET
• A dataset is a collection of data arranged in some order. A
dataset can hold anything from a simple series or array to a database table.
1. Missing data
2. Noisy data
3. Inconsistent data
Why is Data Preprocessing important?
Most real-world datasets used for machine learning are highly
susceptible to missing, inconsistent, and noisy data.
• Data preprocessing is, therefore, important to improve the overall data quality.
• Outliers and inconsistent data points often disturb the model’s
overall learning, leading to false predictions.
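As a small sketch of why this matters, the example below uses made-up sensor readings (an assumption for illustration) containing one missing value and one outlier. It shows how the outlier pulls the mean away from the typical value, and one simple repair: filling the gap with the median.

```python
from statistics import mean, median

# Hypothetical sensor readings: None marks a missing value, 250.0 is an outlier.
readings = [21.0, 22.5, None, 23.0, 250.0, 22.0]

# Missing data: locate the gaps before doing any arithmetic.
missing_positions = [i for i, value in enumerate(readings) if value is None]
observed = [value for value in readings if value is not None]

# Noisy data: one extreme value drags the mean far from the typical reading,
# while the median barely moves.
print("missing at positions:     ", missing_positions)
print("mean of observed values:  ", mean(observed))
print("median of observed values:", median(observed))

# One simple repair: fill each gap with the median of the observed values.
fill_value = median(observed)
cleaned = [fill_value if value is None else value for value in readings]
print("cleaned readings:", cleaned)
```

The median is used for filling here because it is less sensitive to the outlier than the mean. Which statistic is appropriate depends on the data.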
Data Reduction
• The size of the dataset in a data warehouse can be too large to
be handled by data analysis and data mining algorithms.
values, imputing missing values with the mean, median, or mode, or using
• Feature Scaling: bringing features onto a common scale, for example
standardizing each feature to a mean of 0 and a
standard deviation of 1.
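Scaling a feature to a standard deviation of 1 is usually done by standardization: subtract the mean, then divide by the standard deviation. The sketch below assumes the population standard deviation and a made-up feature column. The helper name `standardize` is hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]   # hypothetical feature
scaled = standardize(heights_cm)
print([round(v, 3) for v in scaled])
```

In practice the mean and standard deviation should be computed on the training set only and then reused for the testing set, so that no information leaks from the test data.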
Encoding Categorical Data:
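The handout does not name a specific encoding scheme. One common choice is one-hot encoding, sketched below under that assumption: each category becomes its own 0/1 column. The helper `one_hot_encode` and the sample values are hypothetical.

```python
def one_hot_encode(values):
    """Map each category to a 0/1 vector containing a single 1."""
    categories = sorted(set(values))          # fixed, reproducible column order
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


colors = ["red", "green", "blue", "green"]    # hypothetical categorical feature
categories, encoded = one_hot_encode(colors)
print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```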
Splitting Data:
•Dividing the dataset into training and testing sets to evaluate the model's
performance.
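A minimal sketch of such a split, assuming a random 80/20 division and a fixed seed for repeatability. The helper `split_rows` is a hypothetical name, not a library function.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)     # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))                        # stand-in for 10 dataset rows
train_rows, test_rows = split_rows(rows, test_fraction=0.2, seed=42)
print("train:", train_rows)
print("test: ", test_rows)
```

Shuffling before splitting avoids a test set made up only of the last rows, which matters when the data is stored in some meaningful order.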
Machine learning types?
ALGORITHM DEVELOPMENT STEPS
Basic terminology
Features and Labels:
•Features: The input variables (independent variables) used by the model to make
predictions.
•Labels: The output variable (dependent variable) that the model is trying to predict.
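As a small illustration with made-up housing records (the field names are assumptions), the features are collected into a list of input rows `X` and the labels into a list of outputs `y`:

```python
# Hypothetical housing records: area and rooms are features, price is the label.
records = [
    {"area_m2": 50, "rooms": 2, "price": 150_000},
    {"area_m2": 80, "rooms": 3, "price": 240_000},
    {"area_m2": 120, "rooms": 4, "price": 330_000},
]

feature_names = ["area_m2", "rooms"]
label_name = "price"

X = [[record[name] for name in feature_names] for record in records]  # inputs
y = [record[label_name] for record in records]                        # outputs

print("features:", X)
print("labels:  ", y)
```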
Training and Testing:
•Training Set: A subset of the dataset used to train the model.
•Testing Set: A subset of the dataset used to evaluate the model's performance.
Overfitting and Underfitting:
•Overfitting: When the model performs well on the training data but poorly on the testing
data because it has learned noise and details from the training data.
•Underfitting: When the model performs poorly on both the training and testing data
because it is too simple to capture the underlying patterns in the data.
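The two failure modes can be seen on a toy problem. The sketch below is an illustration under assumed data (a line y = 2x with alternating noise of +1 and -1) and compares three hypothetical models: a constant (too simple), a least-squares line, and a "memorizer" that returns the label of the nearest training point (too flexible).

```python
def mse(model, xs, ys):
    """Mean squared error of the model's predictions on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_constant(xs, ys):
    """Too simple: always predict the mean training label."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Least-squares straight line."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Too flexible: return the label of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


# Hypothetical data: y = 2x plus alternating noise of +1 / -1.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [-0.2, 3.8, 3.8, 7.8, 7.8]

results = {}
for name, fit in [("constant", fit_constant),
                  ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(f"{name:10s} train MSE = {results[name][0]:6.2f}"
          f"   test MSE = {results[name][1]:6.2f}")
```

On this data the constant model has a high error on both sets (underfitting), the memorizer has zero training error but a much larger testing error (overfitting), and the line has a low error on both.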
What libraries do we use for machine learning?
NumPy
• NumPy (Numerical Python) is a very popular Python library for array and
matrix processing, backed by a large collection of high-level
mathematical functions.
• It is very useful for fundamental scientific computations in Machine
Learning.
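A brief sketch of the kind of array and matrix work NumPy handles, using made-up numbers and assuming NumPy is installed:

```python
import numpy as np

# A 2-D array of hypothetical samples: rows are samples, columns are features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
weights = np.array([0.5, -1.0])

col_means = X.mean(axis=0)    # per-column mean
weighted = X @ weights        # matrix-vector product, one value per row
gram = X.T @ X                # 2x2 matrix product

print("shape:", X.shape)
print("column means:", col_means)
print("weighted sums:", weighted)
print("X^T X:\n", gram)
```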
Pandas
• Pandas (the name derives from "panel data") is a popular Python library for data analysis.
• It is not a Machine Learning library itself, but a dataset must be
prepared before training, and Pandas is useful for this because it was developed
specifically for data extraction and preparation.
• It provides data structures and a wide variety of tools for data analysis. It
provides many built-in methods for filtering, combining and grouping
data.
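The three operations named above (filtering, combining, grouping) can be sketched as follows, assuming pandas is installed. The tables and column names are made up for illustration.

```python
import pandas as pd

students = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "dept": ["CS", "CS", "EE", "EE"],
    "score": [82, 67, 91, 74],
})
depts = pd.DataFrame({"dept": ["CS", "EE"], "building": ["North", "South"]})

passed = students[students["score"] >= 70]               # filtering rows
combined = passed.merge(depts, on="dept", how="left")     # combining two tables
mean_by_dept = combined.groupby("dept")["score"].mean()   # grouping and aggregating

print(combined)
print(mean_by_dept)
```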
Matplotlib
• Matplotlib is a Python library for data visualization. Like Pandas, it is
not directly related to Machine Learning. It is needed when a
programmer wants to visualize the patterns in the data.
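A minimal plotting sketch, assuming Matplotlib is installed and using made-up data. It selects the non-interactive "Agg" backend and writes the figure to an in-memory buffer so that it runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")          # non-interactive backend; no window is opened
import matplotlib.pyplot as plt

# Hypothetical data: hours studied versus exam score.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 61, 70, 74, 81]

fig, ax = plt.subplots()
ax.scatter(hours, scores)
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
ax.set_title("Score vs. study time")

buffer = io.BytesIO()
fig.savefig(buffer, format="png")   # pass a file name instead to write to disk
plt.close(fig)
print(f"wrote {buffer.getbuffer().nbytes} bytes of PNG data")
```

In an interactive session or notebook, the backend line can be dropped and `plt.show()` used in place of saving.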
TensorFlow
• TensorFlow can train and run deep neural networks that can be used to develop
several AI applications. It is widely used in the field of deep
learning research and application.
THANK YOU