LAB01
Index | Requirements | Stock Market Analysis | Weather Data Analysis | Healthcare Data Analysis
... | ... | ... factors affecting stock prices. | ... conditions. | ... weather.
7 | Data Analysis with Tool | Use analysis tools such as Google Looker or Power BI to visualize data and perform regression analysis. | Same as Stock Market Analysis. | Same as Stock Market Analysis.
2. Find a Dataset:
o Use the search bar on the Kaggle homepage to find relevant datasets for your
project.
Stock Market Analysis: Search for terms like "stock market prices,"
"historical stock data," or "S&P 500 data."
Weather Data Analysis: Search for "weather data," "historical weather,"
or "climate data."
Healthcare Data Analysis: Search for "healthcare datasets," "patient
data," or "hospital data."
3. Download the Dataset:
o Navigate to the dataset page, click the "Download" button, and choose the CSV
format if available.
4. Install Kaggle API (Optional for Jupyter/Colab):
o If you want to download data directly into your Jupyter notebook or Google
Colab:
Install the Kaggle API using this command:
!pip install kaggle
Authenticate by uploading your Kaggle API token (the kaggle.json file, downloaded from your
Kaggle account settings), then copy it into place and restrict its permissions:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
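Download the dataset directly (the identifier has the form <owner>/<dataset-name>, as it
appears in the dataset page URL):
!kaggle datasets download -d <owner>/<dataset-name>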
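The Kaggle CLI typically saves the dataset as a zip archive in the working directory. The download step above can be followed by unpacking the archive and loading a CSV into pandas, as in the minimal sketch below. The helper name `load_first_csv`, the archive name `dataset.zip`, and the `data` folder are hypothetical placeholders, not names from this lab.

```python
import zipfile
from pathlib import Path

import pandas as pd


def load_first_csv(archive_path, extract_dir="data"):
    """Extract a zip archive and load its first CSV file (hypothetical helper)."""
    extract_dir = Path(extract_dir)
    extract_dir.mkdir(parents=True, exist_ok=True)

    # Unpack everything in the archive into the target folder.
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(extract_dir)

    # Pick the first CSV in alphabetical order; adjust if the dataset has several.
    csv_files = sorted(extract_dir.rglob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(f"No CSV file found in {archive_path}")
    return pd.read_csv(csv_files[0])
```

For example, `stock_data = load_first_csv("dataset.zip")` would produce a DataFrame, assuming the downloaded archive is named `dataset.zip`.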
stock_data.head()
o Check for missing values:
stock_data.isnull().sum()
3. Data Cleaning:
o Remove or fill missing values:
o Convert Date column to datetime:
stock_data['Date'] = pd.to_datetime(stock_data['Date'])
o Filter relevant columns:
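The stock cleaning steps above (handling missing values, converting dates, filtering columns) can be combined as in the following minimal sketch. The column names `Close` and `Volume`, the helper name `clean_stock_data`, and the choice of forward-filling gaps are assumptions for illustration. The actual dataset may use different columns, and dropping rows may suit it better.

```python
import pandas as pd


def clean_stock_data(stock_data, columns=("Date", "Close", "Volume")):
    """Return a cleaned copy: parsed dates, selected columns, no missing values."""
    cleaned = stock_data.copy()

    # Convert the Date column to datetime so rows can be sorted chronologically.
    cleaned["Date"] = pd.to_datetime(cleaned["Date"])

    # Keep only the columns relevant to the analysis (assumed names).
    cleaned = cleaned[list(columns)].sort_values("Date")

    # Forward-fill gaps with the last known value, then drop any leading rows
    # that still have no value to fill from.
    cleaned = cleaned.ffill().dropna()
    return cleaned.reset_index(drop=True)
```

Forward-filling carries the last known price across a gap, which is a common choice for time series. `dropna()` alone is the simpler alternative when gaps are rare.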
weather_data.head()
o Check for missing values:
weather_data.isnull().sum()
3. Data Cleaning:
o Handle missing values:
weather_data.fillna(weather_data.mean(numeric_only=True), inplace=True)
o Convert Date column to datetime:
weather_data['Date'] = pd.to_datetime(weather_data['Date'])
Step 3: Data Analysis
1. Basic Statistics:
weather_data.describe()
2. Visualization of Weather Trends:
plt.plot(weather_data['Date'], weather_data['Temperature'], label='Temperature')
plt.xlabel('Date')
plt.legend()
plt.show()
3. Correlation Matrix:
weather_data.corr(numeric_only=True)
Step 4: Regression Analysis
1. Train a Regression Model:
X = weather_data[['Humidity', 'Pressure', 'Wind Speed']]
y = weather_data['Temperature']
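With X and y defined as above, one way to train and evaluate the model is sketched below. It assumes scikit-learn's `LinearRegression` and an 80/20 train/test split. The handout does not show which model or split the lab uses, and the helper name `train_temperature_model` is hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def train_temperature_model(weather_data, test_size=0.2, random_state=0):
    """Fit a linear regression for Temperature and return (model, test R^2)."""
    features = ["Humidity", "Pressure", "Wind Speed"]
    X = weather_data[features]
    y = weather_data["Temperature"]

    # Hold out part of the data so the score reflects unseen rows.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )

    model = LinearRegression()
    model.fit(X_train, y_train)

    # R^2 on the held-out rows: 1.0 means a perfect fit.
    r2 = r2_score(y_test, model.predict(X_test))
    return model, r2
```

After training, `model.coef_` holds one coefficient per feature, in the order Humidity, Pressure, Wind Speed.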
health_data.head()
o Check for missing values:
health_data.isnull().sum()
3. Data Cleaning:
o Handle missing values and outliers.
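The handout gives no code for handling missing values and outliers. A minimal sketch of one common approach follows: fill missing numeric values with the column median, then clip values outside the interquartile-range (IQR) fences. The helper name `clean_numeric_columns` and the factor 1.5 are assumptions, and the lab may intend a different method.

```python
import pandas as pd


def clean_numeric_columns(data, iqr_factor=1.5):
    """Fill missing numeric values with the median and clip outliers to IQR fences."""
    cleaned = data.copy()
    numeric_cols = cleaned.select_dtypes(include="number").columns

    for col in numeric_cols:
        # The median is less sensitive to extreme values than the mean.
        cleaned[col] = cleaned[col].fillna(cleaned[col].median())

        # Values beyond q1 - k*IQR or q3 + k*IQR are pulled back to the fence.
        q1, q3 = cleaned[col].quantile([0.25, 0.75])
        spread = q3 - q1
        cleaned[col] = cleaned[col].clip(q1 - iqr_factor * spread, q3 + iqr_factor * spread)

    return cleaned
```

Applied to the lab's data this would be `health_data = clean_numeric_columns(health_data)`. Non-numeric columns are left unchanged.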
o Create visualizations such as line charts for stock prices, weather trends, or patient
data.
2. Create Dashboards and Reports:
o Use the drag-and-drop interface to build interactive reports.
o For regression analysis, visualize predicted trends and compare them to actual
data.
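To compare predicted and actual values in a dashboard tool, it helps to have both in one table. The sketch below builds such a table in pandas. The helper name `build_comparison_table`, the column names, and the file name `predictions.csv` are hypothetical.

```python
import pandas as pd


def build_comparison_table(dates, actual, predicted):
    """Combine actual and predicted values into one table, sorted by date."""
    table = pd.DataFrame({
        "Date": pd.to_datetime(list(dates)),
        "Actual": list(actual),
        "Predicted": list(predicted),
    })
    # Positive error means the model predicted too high.
    table["Error"] = table["Predicted"] - table["Actual"]
    return table.sort_values("Date").reset_index(drop=True)
```

Saving the result with `table.to_csv("predictions.csv", index=False)` gives a file that can be loaded as a data source in the visualization tool. The Actual and Predicted columns can then be plotted as two lines against Date.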