hemraj_python_ass1
SET A
1) Create a 'sales' data set having 5 columns, namely ID, TV, Radio, Newspaper
and Sales (500 random entries). Identify the independent and target variables,
split them into training and testing sets in a 7:3 ratio, and print the sets.
Then build a simple linear regression model.
Program:-
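import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# Step 1: Create the 'sales' data set with 500 random entries
# (the value ranges below are assumed; this step was missing from the listing)
np.random.seed(42)
ID = np.arange(1, 501)
TV = np.random.uniform(0, 300, 500)
Radio = np.random.uniform(0, 50, 500)
Newspaper = np.random.uniform(0, 100, 500)
Sales = np.random.uniform(1, 30, 500)
sales_data = pd.DataFrame({
    'ID': ID,
    'TV': TV,
    'Radio': Radio,
    'Newspaper': Newspaper,
    'Sales': Sales
})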
# Step 2: Split the data into independent (X) and target (y) variables
X = sales_data[['TV', 'Radio', 'Newspaper']]
y = sales_data['Sales']
# Step 3: Split the dataset into training and testing sets (7:3 ratio)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
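The listing stops after the split, so the printing and model-building steps asked for in the question are not shown. A minimal, self-contained sketch of those steps might look like the following. The synthetic data, its value ranges and the linear relationship used to generate Sales are all assumptions made here so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 'sales' data set (ranges are assumed).
rng = np.random.default_rng(42)
n = 500
sales_data = pd.DataFrame({
    'ID': np.arange(1, n + 1),
    'TV': rng.uniform(0, 300, n),
    'Radio': rng.uniform(0, 50, n),
    'Newspaper': rng.uniform(0, 100, n),
})
# Assumed relationship, so that the regression has a signal to recover.
sales_data['Sales'] = (3 + 0.05 * sales_data['TV'] + 0.2 * sales_data['Radio']
                       + rng.normal(0, 1, n))

# Independent variables (X) and target variable (y).
X = sales_data[['TV', 'Radio', 'Newspaper']]
y = sales_data['Sales']

# 7:3 split: 350 training rows and 150 testing rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
print("Training set:", X_train.shape, y_train.shape)
print("Testing set:", X_test.shape, y_test.shape)

# Fit on the training rows and score on the held-out rows.
model = LinearRegression()
model.fit(X_train, y_train)
print("Coefficients:", dict(zip(X.columns, model.coef_)))
print("Intercept:", model.intercept_)
print("R^2 on test set:", model.score(X_test, y_test))
```

With purely random, unrelated columns the R^2 score would be close to zero, which is why the sketch generates Sales from TV and Radio.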
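2) 'realestate' data set (columns: ID, flat, houses, purchases)
Program:-
# ID, flat, houses and purchases are assumed to be arrays of equal length
# created beforehand (that step is not shown here)
realestate_data = pd.DataFrame({
    'ID': ID,
    'flat': flat,
    'houses': houses,
    'purchases': purchases
})
# Step 2: Split the data into independent (X) and target (y) variables
X_realestate = realestate_data[['flat', 'houses']]
y_realestate = realestate_data['purchases']
Example output: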
ID   flat    houses   purchases
1    150.5   5.2      853.0
2    130.0   3.1      725.5
3    178.9   8.7      935.8
4    124.3   4.5      688.2
3) Create a 'User' data set having 5 columns, namely User ID, Gender, Age,
EstimatedSalary and Purchased. Build a logistic regression model that predicts,
from the given parameters, whether a person will buy a car.
Program:-
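import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
# user_id, gender, age, estimated_salary and purchased are assumed to be
# arrays of equal length created beforehand (that step is not shown here)
user_data = pd.DataFrame({
    'User ID': user_id,
    'Gender': gender,
    'Age': age,
    'EstimatedSalary': estimated_salary,
    'Purchased': purchased
})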
# Step 3: Split the data into independent (X) and target (y) variables
X_user = user_data[['Age', 'EstimatedSalary', 'Gender']]
y_user = user_data['Purchased']
Output:-
Accuracy of the Logistic Regression Model: 0.89
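The program above imports LabelEncoder and accuracy_score but stops before using them. One way the remaining steps (encoding Gender, splitting, fitting and scoring) could be written is sketched below. The data set is synthetic, and the 400-row size, the value ranges, the purchase rule, the 7:3 split and the feature scaling are all assumptions made here, so the printed accuracy will not match the 0.89 reported above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in for the 'User' data set (size and ranges are assumed).
rng = np.random.default_rng(0)
n = 400
age = rng.integers(18, 61, n)
salary = rng.integers(15000, 150001, n)
user_data = pd.DataFrame({
    'User ID': np.arange(1, n + 1),
    'Gender': rng.choice(['Male', 'Female'], n),
    'Age': age,
    'EstimatedSalary': salary,
})
# Assumed rule: older and better-paid users are more likely to purchase.
signal = (age - 40) / 10 + (salary - 80000) / 40000 + rng.normal(0, 0.5, n)
user_data['Purchased'] = (signal > 0).astype(int)

# Encode the text column Gender as integers (Female -> 0, Male -> 1).
user_data['Gender'] = LabelEncoder().fit_transform(user_data['Gender'])

# Independent variables (X) and target variable (y).
X_user = user_data[['Age', 'EstimatedSalary', 'Gender']]
y_user = user_data['Purchased']
X_train, X_test, y_train, y_test = train_test_split(
    X_user, y_user, test_size=0.3, random_state=42)

# Scale the features; Age and EstimatedSalary differ by orders of magnitude.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit the classifier and measure accuracy on the held-out rows.
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy of the Logistic Regression Model: {accuracy:.2f}")
```

The scaler is fitted on the training rows only, so no information from the test rows leaks into the model.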
SET B
1) Build a simple linear regression model for Fish Species Weight Prediction.
(Download the dataset from
https://www.kaggle.com/aungpyaeap/fish-market?select=Fish.csv)
Program:-
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
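# Step 1: Load the dataset (assumes Fish.csv is in the working directory)
fish_data = pd.read_csv('Fish.csv')
# Step 2: Split the data into independent (X) and target (y) variables
# (if the CSV has Length1/Length2/Length3 instead of 'Length', use one of those)
X_fish = fish_data[['Length', 'Width', 'Height']]
y_fish = fish_data['Weight']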
# Step 3: Split the dataset into training and testing sets
X_train_fish, X_test_fish, y_train_fish, y_test_fish = train_test_split(X_fish, y_fish,
test_size=0.3, random_state=42)
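The program ends at the split, so the fitting and evaluation steps are sketched below under some assumptions. The helper name fit_weight_model and the frame demo are hypothetical, and demo is a small synthetic placeholder (not real fish measurements) used only so the snippet runs without downloading Fish.csv. In practice the frame read from the CSV would be passed in instead.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def fit_weight_model(fish_data, features=('Length', 'Width', 'Height'),
                     target='Weight'):
    """Fit a linear regression for `target` and return (model, R^2, MSE)."""
    X = fish_data[list(features)]
    y = fish_data[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42)
    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return model, r2_score(y_test, y_pred), mean_squared_error(y_test, y_pred)


# Synthetic placeholder data (assumed shapes and relationships, not Fish.csv).
rng = np.random.default_rng(1)
n = 150
length = rng.uniform(10, 60, n)
demo = pd.DataFrame({
    'Length': length,
    'Width': 0.15 * length + rng.normal(0, 0.3, n),
    'Height': 0.3 * length + rng.normal(0, 0.5, n),
})
demo['Weight'] = 20 * demo['Length'] - 150 + rng.normal(0, 30, n)

model, r2, mse = fit_weight_model(demo)
print("Coefficients:", dict(zip(['Length', 'Width', 'Height'], model.coef_)))
print("R^2 on test set:", r2)
print("Mean squared error on test set:", mse)
```

The features argument makes it easy to pass different column names if the downloaded file labels its length columns differently.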
2) Use the iris dataset. Write a Python program to view some basic statistical
details such as percentiles, mean and standard deviation for the species
'Iris-setosa', 'Iris-versicolor' and 'Iris-virginica'. Apply logistic regression
to the dataset to identify the species (setosa, versicolor, virginica) of Iris
flowers given just 4 features: sepal and petal lengths and widths. Find the
accuracy of the model.
Program:-
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
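Only the imports of this program are shown. A minimal sketch of the rest, assuming the copy of the iris data bundled with scikit-learn is acceptable, is given below. Note that load_iris labels the species 'setosa', 'versicolor' and 'virginica', without the 'Iris-' prefix used in the question. The 7:3 stratified split and max_iter=200 are choices made here, not part of the assignment.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the bundled iris data into a DataFrame with a readable species column.
iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['species'] = iris.target_names[iris.target]

# Basic statistics (count, mean, std, min, percentiles, max) per species.
for name, group in iris_df.groupby('species'):
    print(f"\nStatistics for {name}:")
    print(group.describe())

# Four features: sepal length/width and petal length/width.
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Fit a multiclass logistic regression and score it on the held-out rows.
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy of the Logistic Regression Model: {accuracy:.2f}")
```

Stratifying the split keeps the three species equally represented in both the training and the testing rows.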