
ML Assignment - 2

Uploaded by

Aswin Raj

MAULANA AZAD NATIONAL INSTITUTE OF TECHNOLOGY, BHOPAL
Department of Computer Science and Engineering
B.Tech. VI SEMESTER (Section-1), Machine Learning (CS321)

Assignment-2

Q.1: You are given a dataset that represents the relationship between the number of years of
work experience (independent variable) and the monthly salary (dependent variable) of
employees. Your goal is to build a simple linear regression model that predicts the monthly
salary from the years of work experience.
Years of Experience    Monthly Salary
1                      3000
2                      3500
3                      4000
4                      4500
5                      5000

1. Calculate the coefficients (slope and intercept) for the linear regression model.
2. Interpret the meaning of the slope coefficient in the context of the problem.
3. Predict the monthly salary for an employee with 6 years of work experience.
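
The three parts of Q.1 can be checked with the closed-form least-squares formulas, slope = S_xy / S_xx and intercept = mean(y) - slope * mean(x). The sketch below is one possible way to do this in plain Python; the variable and function names are illustrative and are not part of the assignment.

```python
# Closed-form simple linear regression on the Q.1 table.
years = [1, 2, 3, 4, 5]
salary = [3000, 3500, 4000, 4500, 5000]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(salary) / n

# S_xy: co-variation of experience and salary; S_xx: variation of experience
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, salary))
s_xx = sum((x - mean_x) ** 2 for x in years)

slope = s_xy / s_xx                  # change in salary per extra year
intercept = mean_y - slope * mean_x  # fitted salary at zero years


def predict_salary(x):
    """Fitted monthly salary for x years of experience."""
    return intercept + slope * x


print(slope, intercept, predict_salary(6))  # 500.0 2500.0 5500.0
```

Because the five points lie exactly on a line, the fit has zero residual: each additional year of experience corresponds to 500 more in monthly salary under this model.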

Q.2: Suppose you are working on a medical dataset that contains information about patients'
health metrics (such as blood pressure, cholesterol levels, age) and whether they are at risk of
developing a particular disease (1 for at risk, 0 for not at risk). You are tasked with building a
logistic regression model to predict the risk of developing the disease based on these health
metrics.

Here's a sample dataset:

Age (A)    Blood Pressure (B)    Cholesterol (C)    Disease Risk (Output)
45         130                   210                1
60         150                   220                1
35         120                   200                0
55         140                   240                1
50         135                   230                0

1. Using the provided dataset, calculate the coefficients (intercept and coefficients for each
feature) for the logistic regression model to predict the disease risk based on age, blood
pressure, and cholesterol levels.
2. Interpret the meaning of the coefficient for blood pressure in the context of the problem.
3. What is the predicted probability of a 40-year-old patient with blood pressure 125 and
cholesterol 190 being at risk of developing the disease?
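
One possible way to approach Q.2 numerically is gradient descent on the log-loss. The sketch below makes several assumptions that the assignment does not specify: the features are standardized, a small L2 penalty is applied to the weights (with only five rows, unpenalized maximum likelihood may not have finite coefficients if the classes happen to be separable), and the learning rate, penalty, and epoch count are arbitrary choices. The resulting numbers depend on those choices, so treat them as illustrative.

```python
import math

# Q.2 table: (age, blood pressure, cholesterol) -> disease risk
X = [[45, 130, 210], [60, 150, 220], [35, 120, 200], [55, 140, 240], [50, 135, 230]]
y = [1, 1, 0, 1, 0]

n, d = len(X), len(X[0])
means = [sum(row[j] for row in X) / n for j in range(d)]
stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) for j in range(d)]


def scale(row):
    """Standardize one row with the training means and standard deviations."""
    return [(row[j] - means[j]) / stds[j] for j in range(d)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def prob(row_scaled, w, b):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, row_scaled)))


Xs = [scale(row) for row in X]


def log_loss(w, b):
    """Average negative log-likelihood on the training rows (no penalty term)."""
    return -sum(t * math.log(prob(r, w, b)) + (1 - t) * math.log(1 - prob(r, w, b))
                for r, t in zip(Xs, y)) / n


# Assumed hyperparameters, not given in the assignment
LEARNING_RATE, L2_PENALTY, EPOCHS = 0.1, 0.1, 2000

w, b = [0.0] * d, 0.0
initial_loss = log_loss(w, b)
for _ in range(EPOCHS):
    errors = [prob(r, w, b) - t for r, t in zip(Xs, y)]
    grad_b = sum(errors) / n
    grad_w = [sum(e * r[j] for e, r in zip(errors, Xs)) / n for j in range(d)]
    b -= LEARNING_RATE * grad_b
    w = [wj - LEARNING_RATE * (gj + L2_PENALTY * wj) for wj, gj in zip(w, grad_w)]
final_loss = log_loss(w, b)

# Part 3: patient aged 40, blood pressure 125, cholesterol 190
p_risk = prob(scale([40, 125, 190]), w, b)
print("weights (per standard deviation):", w, "intercept:", b)
print("P(at risk):", p_risk)
```

The weights here are on the standardized scale; dividing the blood-pressure weight by `stds[1]` gives the change in log-odds per unit of blood pressure, which is the quantity part 2 asks to interpret.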

Q.3: Using the dataset given below, build a binary logistic regression model to predict the
likelihood of a customer purchasing the product based on their demographic information (age,
income, gender). Interpret the coefficients and predict the probability that a 35-year-old
male with an income of $50,000 purchases the product.

Age    Income ($1000s)    Gender    Purchased (1/0)
25     35                 Male      1
30     50                 Female    0
35     45                 Male      1
40     60                 Female    1
45     55                 Male      0
50     70                 Female    1
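
For Q.3 the categorical Gender column has to be turned into a number before fitting, and each fitted coefficient can then be read as an odds ratio via exp(coefficient). The sketch below is a minimal illustration under assumptions that are not in the assignment: Male is coded 1 and Female 0, age is expressed in decades and income in $10,000s so that plain gradient descent behaves well, and a small L2 penalty with arbitrarily chosen hyperparameters keeps the weights finite on six rows.

```python
import math

# Q.3 table: age, income ($1000s), gender, purchased
data = [
    (25, 35, "Male", 1),
    (30, 50, "Female", 0),
    (35, 45, "Male", 1),
    (40, 60, "Female", 1),
    (45, 55, "Male", 0),
    (50, 70, "Female", 1),
]


def encode(age, income, gender):
    """Age in decades, income in $10,000s, gender as 1 (Male) or 0 (Female)."""
    return [age / 10.0, income / 10.0, 1.0 if gender == "Male" else 0.0]


X = [encode(a, inc, g) for a, inc, g, _ in data]
y = [label for *_, label in data]
n, d = len(X), len(X[0])


def probability(features, w, b):
    z = b + sum(wj * xj for wj, xj in zip(w, features))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, b):
    """Average negative log-likelihood on the table (no penalty term)."""
    total = 0.0
    for row, t in zip(X, y):
        p = probability(row, w, b)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / n


# Assumed hyperparameters, not given in the assignment
LEARNING_RATE, L2_PENALTY, EPOCHS = 0.05, 0.1, 5000

w, b = [0.0] * d, 0.0
initial_loss = log_loss(w, b)
for _ in range(EPOCHS):
    errors = [probability(row, w, b) - t for row, t in zip(X, y)]
    grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(d)]
    b -= LEARNING_RATE * sum(errors) / n
    w = [wj - LEARNING_RATE * (gj + L2_PENALTY * wj) for wj, gj in zip(w, grad_w)]
final_loss = log_loss(w, b)

# exp(w_j): multiplicative change in the odds of purchase per unit of feature j
# (per 10 years of age, per $10,000 of income, Male versus Female)
odds_ratios = [math.exp(wj) for wj in w]

p_purchase = probability(encode(35, 50, "Male"), w, b)
print("odds ratios (age, income, male):", odds_ratios)
print("P(purchase) for a 35-year-old male earning $50,000:", p_purchase)
```

An odds ratio above 1 means the feature raises the odds of purchase in this fitted model and a value below 1 means it lowers them; with only six rows these estimates are sensitive to the assumed penalty.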

Q.4: Build a linear regression model to predict the price of a house based on its features.
Here's a sample of the dataset:

Area    Bedrooms    Bathrooms    Age    Location    Garage    Yard Size    Stories    Quality    Price
2000    3           2            20     Downtown    1         500          2          7          300000
2500    4           3            15     Suburbs     1         1000         1          8          400000
1800    3           2            10     Downtown    0         400          1          6          250000
3000    5           4            5      Suburbs     1         1200         2          9          500000
2200    4           3            8      Downtown    1         600          1          7          350000

1. Perform exploratory data analysis (EDA) on the dataset to understand the distribution of
features, identify outliers, and check for correlations between features and the target
variable (Price).
2. Preprocess the dataset by handling missing values, encoding categorical variables, and
scaling numerical features if necessary.
3. Split the dataset into training and testing sets (e.g., 80% training, 20% testing).
4. Build a linear regression model to predict house prices using the features provided.
5. Evaluate the performance of the model using appropriate metrics such as mean squared
error (MSE) or R-squared.
6. Interpret the coefficients of the model and discuss the significance of each feature in
predicting house prices.
7. Use the trained model to predict the price of a new house with the following features:
   Area: 2800 sq. ft.
   Bedrooms: 4
   Bathrooms: 3
   Age: 12 years
   Location: Suburbs
   Garage: Yes (1)
   Yard Size: 800 sq. ft.
   Stories: 2
   Quality: 8
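
Steps 2 to 5 and step 7 of Q.4 can be sketched end to end in plain Python, as below. This is only one possible arrangement and rests on assumptions not stated in the assignment: Location is encoded as a single 0/1 indicator (Suburbs = 1), the split simply holds out the last row instead of shuffling, features are standardized with training-set statistics, and the model is fitted by gradient descent with a small ridge penalty and arbitrarily chosen hyperparameters. With four training rows and nine features the coefficients are not uniquely determined, so the output illustrates the workflow and is not a reliable price estimate.

```python
import math

# Columns: area, bedrooms, bathrooms, age, location, garage, yard size,
# stories, quality, price
rows = [
    (2000, 3, 2, 20, "Downtown", 1, 500, 2, 7, 300000),
    (2500, 4, 3, 15, "Suburbs", 1, 1000, 1, 8, 400000),
    (1800, 3, 2, 10, "Downtown", 0, 400, 1, 6, 250000),
    (3000, 5, 4, 5, "Suburbs", 1, 1200, 2, 9, 500000),
    (2200, 4, 3, 8, "Downtown", 1, 600, 1, 7, 350000),
]


def encode(features):
    """Map Location to 1.0 for Suburbs and 0.0 otherwise; pass numbers through."""
    return [float(v == "Suburbs") if isinstance(v, str) else float(v) for v in features]


X = [encode(r[:-1]) for r in rows]
y = [float(r[-1]) for r in rows]

# 80/20 split: first four rows train, last row tests (assumption: no shuffling)
X_train, y_train, X_test, y_test = X[:4], y[:4], X[4:], y[4:]

n, d = len(X_train), len(X_train[0])
means = [sum(r[j] for r in X_train) / n for j in range(d)]
# "or 1.0" guards against a column that is constant in the training rows
stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in X_train) / n) or 1.0
        for j in range(d)]


def scale(row):
    return [(row[j] - means[j]) / stds[j] for j in range(d)]


def predict(row, w, b):
    return b + sum(wj * xj for wj, xj in zip(w, scale(row)))


def mse(X_part, y_part, w, b):
    return sum((predict(r, w, b) - t) ** 2 for r, t in zip(X_part, y_part)) / len(y_part)


# Assumed hyperparameters, not given in the assignment
LEARNING_RATE, RIDGE, EPOCHS = 0.05, 0.01, 5000

Xs_train = [scale(r) for r in X_train]
w, b = [0.0] * d, 0.0
start_mse = mse(X_train, y_train, w, b)
for _ in range(EPOCHS):
    errors = [predict(r, w, b) - t for r, t in zip(X_train, y_train)]
    grad_w = [sum(e * r[j] for e, r in zip(errors, Xs_train)) / n for j in range(d)]
    b -= LEARNING_RATE * sum(errors) / n
    w = [wj - LEARNING_RATE * (gj + RIDGE * wj) for wj, gj in zip(w, grad_w)]

train_mse = mse(X_train, y_train, w, b)
test_mse = mse(X_test, y_test, w, b)

# Step 7: the new house from the assignment, with Garage: Yes coded as 1
new_house = encode((2800, 4, 3, 12, "Suburbs", 1, 800, 2, 8))
new_price = predict(new_house, w, b)
print("train MSE:", train_mse, "test MSE:", test_mse)
print("predicted price for the new house:", new_price)
```

R-squared is not reported on the test set here because it is undefined for a single held-out row; on a larger dataset it would be computed alongside MSE.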
