Data Preprocessing & Visualization1
Initial Data:
   ID   Age   Salary   Department  Experience
0   1  25.0  50000.0        Sales           2
1   2  30.0  60000.0  Engineering           5
2   3  22.0  45000.0        Sales           1
3   4  35.0      NaN           HR          10
4   5  28.0  70000.0  Engineering           4
5   6  40.0  80000.0           HR          15
6   7  38.0  75000.0        Sales          12
7   8   NaN  62000.0  Engineering           7
8   9  45.0  90000.0           HR          20
9  10  32.0  54000.0        Sales           6
2) Handling Outliers
In [6]: # Remove outliers from 'Salary' using the IQR method
Q1 = df['Salary'].quantile(0.25)
Q3 = df['Salary'].quantile(0.75)
IQR = Q3 - Q1
df = df[~((df['Salary'] < (Q1 - 1.5 * IQR)) | (df['Salary'] > (Q3 + 1.5 * IQR)))]
print("\nData after removing outliers:\n", df)
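As a rough check of what the IQR rule does on this dataset, the sketch below recomputes the fences for the Salary column alone. It assumes the values are exactly those printed under "Initial Data" (including the missing salary), which may differ from the state of `df` when cell 6 ran. The names `salary`, `lower`, `upper`, `is_outlier` and `kept` are illustrative and do not come from the notebook.

```python
import numpy as np
import pandas as pd

# Salary column as printed under "Initial Data" (assumption: no earlier
# imputation has been applied to these values).
salary = pd.Series(
    [50000, 60000, 45000, np.nan, 70000, 80000, 75000, 62000, 90000, 54000],
    name="Salary",
    dtype="float64",
)

# Series.quantile skips NaN, so the quartiles use the 9 known salaries.
q1 = salary.quantile(0.25)
q3 = salary.quantile(0.75)
iqr = q3 - q1

# Standard 1.5 * IQR fences, as in the cell above.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Comparisons against NaN evaluate to False, so a missing salary is
# never flagged as an outlier by this mask.
is_outlier = (salary < lower) | (salary > upper)
kept = salary[~is_outlier]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"lower fence={lower}, upper fence={upper}")
print("rows flagged:", int(is_outlier.sum()))
print("rows kept:", len(kept))
```

With these ten values the quartiles are 54000 and 75000, giving fences of 22500 and 106500, so no row is flagged and all ten rows are kept. The row with the missing salary also passes the filter, because both comparisons are False for NaN; missing values need to be handled in a separate step.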
# Histogram of Age
plt.subplot(2, 2, 2)
sns.histplot(df['Age'], kde=True)
plt.title('Histogram of Age')
plt.tight_layout()
plt.show()
C:\Users\Reena\anaconda3\Lib\site-packages\seaborn\_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
  if pd.api.types.is_categorical_dtype(vector):
C:\Users\Reena\anaconda3\Lib\site-packages\seaborn\_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
  with pd.option_context('mode.use_inf_as_na', True):
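The FutureWarning messages in the plot output are raised inside seaborn, not by the notebook code, and they do not change the plot. A newer seaborn release may no longer trigger them. If upgrading is not an option, one possible approach is to ignore only these two messages with the standard `warnings` module. The sketch below is an assumption about how this could be done, not part of the original notebook; the helper name `silence_known_future_warnings` is hypothetical, and the demonstration raises the warnings by hand so it runs without seaborn.

```python
import warnings

# Message prefixes copied from the two distinct warnings in the output.
NOISY_PREFIXES = (
    "is_categorical_dtype is deprecated",
    "use_inf_as_na option is deprecated",
)


def silence_known_future_warnings():
    """Ignore only the two known FutureWarnings (hypothetical helper)."""
    for prefix in NOISY_PREFIXES:
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings("ignore", message=prefix, category=FutureWarning)


# Demonstration with hand-raised warnings instead of a real seaborn call.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # record every warning by default
    silence_known_future_warnings()   # then ignore the two known messages
    warnings.warn(
        "is_categorical_dtype is deprecated and will be removed in a future version.",
        FutureWarning,
    )
    warnings.warn("some other deprecation", FutureWarning)

remaining = [str(w.message) for w in caught]
print(remaining)
```

Only the unrelated warning is left in `remaining`, which shows that the filter is narrow: other FutureWarnings still reach the user. In a notebook, calling the helper once before the plotting cell would have the same effect for the rest of the session.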