Week-5 - Jupyter Notebook

Uploaded by

pramidibalu2005

8/5/24, 11:16 AM Week-5 - Jupyter Notebook

In [1]: import sklearn

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [4]: from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from xgboost import XGBClassifier
from sklearn import metrics

In [5]: import warnings

In [6]: warnings.filterwarnings('ignore')

In [11]: #LOAD DATASET

In [7]: df=pd.read_csv('TSLA.csv')

In [9]: df.head()

Out[9]:
Date Open High Low Close Adj Close Volume

0 2010-06-29 19.000000 25.00 17.540001 23.889999 23.889999 18766300

1 2010-06-30 25.790001 30.42 23.299999 23.830000 23.830000 17187100

2 2010-07-01 25.000000 25.92 20.270000 21.959999 21.959999 8218800

3 2010-07-02 23.000000 23.10 18.709999 19.200001 19.200001 5139800

4 2010-07-06 20.000000 20.00 15.830000 16.110001 16.110001 6866900

In [10]: df.tail()

Out[10]:
Date Open High Low Close Adj Close Volume

2411 2020-01-28 568.489990 576.809998 558.080017 566.900024 566.900024 11788500

2412 2020-01-29 575.690002 589.799988 567.429993 580.989990 580.989990 17801500

2413 2020-01-30 632.419983 650.880005 618.000000 640.809998 640.809998 29005700

2414 2020-01-31 640.000000 653.000000 632.520020 650.570007 650.570007 15719300

2415 2020-02-03 673.690002 786.140015 673.520020 780.000000 780.000000 47065000

In [12]: #EXPLORE dimensions

print('number of data columns:',df.shape[1],'\nnumber of data rows:',df.shape[0])

number of data columns: 7

number of data rows: 2416

localhost:8888/notebooks/Week-5.ipynb# 1/9

In [13]: df.describe()

Out[13]:
Open High Low Close Adj Close Volume

count 2416.000000 2416.000000 2416.000000 2416.000000 2416.000000 2.416000e+03

mean 186.271147 189.578224 182.916639 186.403651 186.403651 5.572722e+06

std 118.740163 120.892329 116.857591 119.136020 119.136020 4.987809e+06

min 16.139999 16.629999 14.980000 15.800000 15.800000 1.185000e+05

25% 34.342498 34.897501 33.587501 34.400002 34.400002 1.899275e+06

50% 213.035004 216.745002 208.870002 212.960007 212.960007 4.578400e+06

75% 266.450012 270.927513 262.102501 266.774994 266.774994 7.361150e+06

max 673.690002 786.140015 673.520020 780.000000 780.000000 4.706500e+07

In [14]: df.info() #Summary of dataframe

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2416 entries, 0 to 2415
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 2416 non-null object
1 Open 2416 non-null float64
2 High 2416 non-null float64
3 Low 2416 non-null float64
4 Close 2416 non-null float64
5 Adj Close 2416 non-null float64
6 Volume 2416 non-null int64
dtypes: float64(5), int64(1), object(1)
memory usage: 132.2+ KB

In [15]: df['date']=pd.to_datetime(df.Date)

In [16]: df.date.dtype

Out[16]: dtype('<M8[ns]')

EXPLORATORY DATA ANALYSIS


In [17]: plt.figure(figsize=(15,5))
sns.lineplot(data=df,x='date',y='Close')
plt.title('Tesla Close Price',fontsize=15)
plt.ylabel('Price in dollars')
plt.show()

In [20]: #Check whether Close and Adj Close are identical in every row

In [19]: df[df['Close']==df['Adj Close']].shape

Out[19]: (2416, 8)
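
The filtered frame above keeps all 2416 rows, so Close and Adj Close agree everywhere. A more direct form of the same check, sketched on a small made-up frame (the prices and the name `toy` are illustrative assumptions, not values from TSLA.csv), reduces the comparison to a single boolean:

```python
import pandas as pd

# Toy frame with invented prices; not taken from TSLA.csv.
toy = pd.DataFrame({
    "Close": [23.89, 23.83, 21.96],
    "Adj Close": [23.89, 23.83, 21.96],
})

# True only if every row holds the same value in both columns.
columns_identical = bool((toy["Close"] == toy["Adj Close"]).all())
print(columns_identical)
```

If the result is True, one of the two columns is redundant and can be dropped.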

In [21]: df.drop(['Adj Close','date'],axis=1,inplace=True)

In [23]: df.head()

Out[23]:
Date Open High Low Close Volume

0 2010-06-29 19.000000 25.00 17.540001 23.889999 18766300

1 2010-06-30 25.790001 30.42 23.299999 23.830000 17187100

2 2010-07-01 25.000000 25.92 20.270000 21.959999 8218800

3 2010-07-02 23.000000 23.10 18.709999 19.200001 5139800

4 2010-07-06 20.000000 20.00 15.830000 16.110001 6866900

In [24]: df.isnull().sum()

Out[24]: Date 0
Open 0
High 0
Low 0
Close 0
Volume 0
dtype: int64


In [26]: features=['Open','High','Low','Close','Volume']
plt.subplots(figsize=(20,10))
for i,col in enumerate(features):
    plt.subplot(2,3,i+1)
    #distplot is deprecated since seaborn 0.11; sns.histplot(df[col],kde=True) is the newer form
    sns.distplot(df[col])
plt.show()

In [28]: #Box plots to check each feature for outliers

In [27]: plt.subplots(figsize=(20,10))
for i,col in enumerate(features):
    plt.subplot(2,3,i+1)
    sns.boxplot(df[col])
plt.show()

FEATURE ENGINEERING


Feature Construction

In [29]: splitted=df['Date'].str.split('-',expand=True)

In [30]: df['Day']=splitted[2].astype('int')
df['Month']=splitted[1].astype('int')
df['Year']=splitted[0].astype('int')
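
The same three columns can also be read from parsed dates instead of split strings. This is a minimal sketch on made-up rows (the frame `toy` and its three dates are assumptions for illustration), using the `.dt` accessor that pandas provides on datetime columns:

```python
import pandas as pd

# Toy dates in the same YYYY-MM-DD string format as the Date column.
toy = pd.DataFrame({"Date": ["2010-06-29", "2010-06-30", "2010-07-01"]})

# Parse once, then read the calendar parts from the .dt accessor.
parsed = pd.to_datetime(toy["Date"])
toy["Day"] = parsed.dt.day
toy["Month"] = parsed.dt.month
toy["Year"] = parsed.dt.year

print(toy)
```

Both routes yield integer Day, Month and Year columns; the string split depends on the exact date format, while the parsed route does not.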

In [31]: df.drop('Date',axis=1,inplace=True)

In [32]: df.head()

Out[32]:
Open High Low Close Volume Day Month Year

0 19.000000 25.00 17.540001 23.889999 18766300 29 6 2010

1 25.790001 30.42 23.299999 23.830000 17187100 30 6 2010

2 25.000000 25.92 20.270000 21.959999 8218800 1 7 2010

3 23.000000 23.10 18.709999 19.200001 5139800 2 7 2010

4 20.000000 20.00 15.830000 16.110001 6866900 6 7 2010

is_quarter_end is 1 when Month is 3, 6, 9 or 12, and 0 otherwise

In [33]: df['is_quarter_end']=np.where(df['Month']%3==0,1,0)

In [34]: df.head()

Out[34]:
Open High Low Close Volume Day Month Year is_quarter_end

0 19.000000 25.00 17.540001 23.889999 18766300 29 6 2010 1

1 25.790001 30.42 23.299999 23.830000 17187100 30 6 2010 1

2 25.000000 25.92 20.270000 21.959999 8218800 1 7 2010 0

3 23.000000 23.10 18.709999 19.200001 5139800 2 7 2010 0

4 20.000000 20.00 15.830000 16.110001 6866900 6 7 2010 0

In [39]: data_grouped=df.groupby('Year').mean() #GROUPING BY YEAR


In [40]: plt.subplots(figsize=(20,10))
for i,col in enumerate(['Open','High','Low','Close']):
    plt.subplot(2,2,i+1)
    data_grouped[col].plot.bar()
plt.show()

In [42]: df.groupby('is_quarter_end').mean()

Out[42]:
Open High Low Close Volume Day Month

is_quarter_end

0 185.875081 189.254226 182.449499 186.085081 5.767062e+06 15.710396 6.173886 2014.

1 187.071200 190.232700 183.860262 187.047163 5.180154e+06 15.825000 7.597500 2014.

In [43]: df['open-close']=df['Open']-df['Close']
df['high-low']=df['High']-df['Low']
df['target']=np.where(df['Close'].shift(-1) > df['Close'],1,0)
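
To make the target definition concrete, here is a small sketch on invented closing prices (the frame `toy` and its values are assumptions for illustration). It applies the same `shift(-1)` comparison and shows that the final row, which has no next day, is labelled 0:

```python
import numpy as np
import pandas as pd

# Invented closing prices, for illustration only.
toy = pd.DataFrame({"Close": [10.0, 12.0, 11.0, 11.0, 15.0]})

# shift(-1) aligns each row with the next day's close.
next_close = toy["Close"].shift(-1)

# 1 if the next close is higher than today's, else 0.
# The last row compares against NaN, which is False, so it gets 0.
toy["target"] = np.where(next_close > toy["Close"], 1, 0)

print(toy["target"].tolist())  # [1, 0, 0, 1, 0]
```

An unchanged close (11.0 followed by 11.0) also counts as 0, because the comparison is strict.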


In [44]: plt.figure(figsize=(10,10))
sns.heatmap(df.corr()>0.9,annot=True,cbar=False)
plt.show()


In [45]: counts=df['target'].value_counts().sort_index() #sort by class so each slice gets its own label
plt.pie(counts.values,labels=counts.index,autopct='%1.1f%%')
plt.show()

In [46]: features=df[['open-close','high-low','is_quarter_end']]
target=df['target']

In [47]: scaler=StandardScaler()
features=scaler.fit_transform(features)

In [48]: #SPLIT DATASET

x_train,x_test,y_train,y_test=train_test_split(features,target,test_size=0.1,random_st
print(x_test.shape,x_train.shape)

(242, 3) (2174, 3)
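
The printed shapes follow from the row count: with test_size=0.1, scikit-learn rounds the test share up, so 2416 rows give 242 test rows and 2174 training rows. The sketch below reproduces that arithmetic on placeholder arrays. The zero-filled `X` and `y` and the value `random_state=0` are assumptions for illustration, since the notebook's own random_state is cut off in the printout:

```python
import math
import numpy as np
from sklearn.model_selection import train_test_split

n_rows = 2416                # row count reported by df.shape earlier
X = np.zeros((n_rows, 3))    # placeholder features, 3 columns as in the notebook
y = np.zeros(n_rows)         # placeholder labels

# random_state=0 is an arbitrary choice for this sketch only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

expected_test = math.ceil(0.1 * n_rows)    # test share is rounded up
expected_train = n_rows - expected_test    # the remainder goes to training

print(X_test.shape, X_train.shape)  # (242, 3) (2174, 3)
```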


In [50]: #MODEL DEVELOPMENT & EVALUATION

class CustomXGBClassifier(XGBClassifier):
    #Shorten the printed name of the XGBoost model
    def __repr__(self):
        return "XGBClassifier"
models=[LogisticRegression(),SVC(kernel='poly',probability=True),CustomXGBClassifier(
for model in models:
    model.fit(x_train,y_train)
    #Both scores are ROC AUC values, although the printed labels say "Accuracy"
    training_accuracy=metrics.roc_auc_score(y_train,model.predict_proba(x_train)[:,1])
    validation_accuracy=metrics.roc_auc_score(y_test,model.predict_proba(x_test)[:,1])
    print(model)
    print("Training Accuracy:",training_accuracy)
    print("Validation Accuracy:",validation_accuracy)

LogisticRegression()
Training Accuracy: 0.5228802330060918
Validation Accuracy: 0.4923371647509579
SVC(kernel='poly', probability=True)
Training Accuracy: 0.4704775693536028
Validation Accuracy: 0.5374247400109469
XGBClassifier
Training Accuracy: 0.943461732220797
Validation Accuracy: 0.4487889983579639
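
The scores printed above come from `metrics.roc_auc_score`, so they measure how well the predicted probabilities rank the two classes rather than the share of correct labels. The following sketch, on invented labels and probabilities (all values here are assumptions for illustration), shows that the two metrics can differ sharply on the same predictions:

```python
import numpy as np
from sklearn import metrics

# Invented labels and predicted probabilities, for illustration only.
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.6, 0.7, 0.8, 0.9])

# ROC AUC scores the ranking: every positive outranks every negative here.
auc = metrics.roc_auc_score(y_true, y_prob)

# Accuracy scores hard labels, here thresholded at 0.5 (all predicted as 1).
y_pred = (y_prob >= 0.5).astype(int)
acc = metrics.accuracy_score(y_true, y_pred)

print("ROC AUC:", auc)    # 1.0
print("Accuracy:", acc)   # 0.5
```

A ROC AUC near 0.5, as in the validation scores above, corresponds to a ranking no better than chance.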

