Active Product Sales Analysis using Matplotlib in Python
Last Updated :
10 Sep, 2024
Every modern company that sells online or runs a dedicated e-commerce website wants to understand what its customers need so that it can increase its sales. A large transaction dataset can be analyzed to find out which time of day has the highest user activity in terms of transactions.
In this post, we will use Python with Pandas and Matplotlib to draw insights from the dataset. In this case, we can use the Transaction Date column to find the busiest time (hour) of the day. You can access the entire dataset here.
Stepwise Implementation
Step 1:
First, we need to import the required libraries and then load the dataset into a DataFrame.
Python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Order_Details = pd.read_csv('Order_details(masked).csv')
Step 2:
Create a new column called Time by converting the Transaction Date column to the datetime format. The datetime format, which follows the pattern YYYY-MM-DD HH:MM:SS by default, can be customized however you choose. Here we are more interested in the hour, so we create an Hour column using the built-in dt.hour accessor:
Python
# here we have taken Transaction
# date column
Order_Details['Time'] = pd.to_datetime(Order_Details['Transaction Date'])
# After that we extracted hour
# from Transaction date column
Order_Details['Hour'] = (Order_Details['Time']).dt.hour
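To see what this conversion does in isolation, here is a minimal sketch on a few made-up rows. The sample values and the use of errors="coerce" are assumptions for illustration and are not taken from the original dataset; only the column names match the article.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset
sample = pd.DataFrame({
    "Transaction Date": [
        "2024-01-05 09:15:00",
        "2024-01-05 21:40:10",
        "2024-01-06 21:05:33",
        "not a date",
    ]
})

# errors="coerce" turns unparseable values into NaT instead of raising
sample["Time"] = pd.to_datetime(sample["Transaction Date"], errors="coerce")

# .dt.hour extracts the hour (0-23); rows with NaT yield NaN
sample["Hour"] = sample["Time"].dt.hour

print(sample[["Transaction Date", "Hour"]])
```

Because one row could not be parsed, the Hour column is stored as floats with a NaN in that row. Whether to drop or keep such rows depends on how clean the real data is.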
Step 3:
We then need the "n" busiest hours. To get them, we take the first "n" entries of the hour frequencies, which value_counts() returns sorted from the most frequent to the least frequent. We use value_counts() to compute the frequencies and tolist() to convert the result to a list. We also build a list of the corresponding index values (the hours).
Python
# n =24 in this case, can be modified
# as per need to see top 'n' busiest hours
timemost1 = Order_Details['Hour'].value_counts().index.tolist()[:24]
timemost2 = Order_Details['Hour'].value_counts().values.tolist()[:24]
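As a small illustration of how value_counts() orders its result, the sketch below uses a hypothetical Series of hours in place of Order_Details['Hour'] and keeps the top n entries with head(). The sample values and variable names are assumptions.

```python
import pandas as pd

# Hypothetical hour values standing in for Order_Details['Hour']
hours = pd.Series([21, 9, 21, 14, 21, 9, 3])

n = 2  # number of busiest hours to keep

# value_counts() sorts by count in descending order, so head(n)
# returns the n most frequent hours
top = hours.value_counts().head(n)

top_hours = top.index.tolist()    # the hours themselves
top_counts = top.values.tolist()  # how often each hour occurs

print(top_hours, top_counts)
```

Calling value_counts() once and reusing the result avoids computing the same frequencies twice.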
Step 4:
Finally, we stack the indices (hour) and frequencies together to yield the final result.
Python
tmost = np.column_stack((timemost1,timemost2))
print(" Hour Of Day" + "\t" + "Cumulative Number of Purchases \n")
print('\n'.join('\t\t'.join(map(str, row)) for row in tmost))
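An alternative to stacking two lists is to keep the result in a DataFrame, which carries its own column labels. This is only a sketch on hypothetical data; the column label "Number of Purchases" is an assumption chosen for readability.

```python
import pandas as pd

# Hypothetical hour values standing in for Order_Details['Hour']
hours = pd.Series([21, 9, 21, 14, 21, 9, 3])

# Name the index, then move it into a regular column next to the counts
table = (
    hours.value_counts()
    .rename_axis("Hour Of Day")
    .reset_index(name="Number of Purchases")
)

# to_string(index=False) prints the table without the row numbers
print(table.to_string(index=False))
```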
Step 5:
Before we can create an appropriate data visualization, we need the counts ordered by hour of day rather than by frequency. To do so, we gather the hourly frequencies and sort them by their index (the hour):
Python
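# Count the transactions per hour and order the counts by hour of day
timemost = Order_Details['Hour'].value_counts()
timemost2 = timemost.sort_index()
# Use the hours actually present in the data as x-values so that
# the x and y sequences always have the same length
timemost1 = timemost2.index.tolist()
timemost2 = pd.DataFrame(timemost2)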
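If some hours have no transactions at all, they are simply missing from the value_counts() result. One way to handle this, sketched below on hypothetical data, is to reindex the counts over all 24 hours and fill the gaps with zero. The names per_hour, x_hours, and y_counts are illustrative.

```python
import pandas as pd

# Hypothetical hour values; most hours of the day never occur here
hours = pd.Series([21, 9, 21, 14, 21, 9, 3])

# reindex() aligns the counts to hours 0..23 in order;
# fill_value=0 is used for hours without any transaction
per_hour = hours.value_counts().reindex(range(24), fill_value=0)

x_hours = per_hour.index.tolist()
y_counts = per_hour.tolist()

print(len(x_hours), len(y_counts))
```

With this approach the x and y sequences always contain 24 entries, so the plot covers the full day even when the data is sparse.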
Step 6:
For data visualization, we will use Matplotlib, as it is one of the most convenient and commonly used libraries. However, it is up to you to choose any of the existing libraries, such as Matplotlib, Ggplot, or Seaborn, to plot the data graphically.
The commands below put the hours on the X-axis and the number of transactions on the Y-axis, and set other aspects of the line chart such as the title, fonts, and color.
Python
plt.figure(figsize=(20, 10))
plt.title('Sales Happening Per Hour (Spread Throughout The Week)',
          fontdict={'fontname': 'monospace', 'fontsize': 30}, y=1.05)
plt.ylabel("Number Of Purchases Made", fontsize=18, labelpad=20)
plt.xlabel("Hour", fontsize=18, labelpad=20)
plt.plot(timemost1, timemost2, color='m')
plt.grid()
plt.show()
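Since hourly counts are discrete, a bar chart is another reasonable way to show them. The sketch below is an alternative to the line chart, not the article's original method. It uses hypothetical data, Matplotlib's object-oriented interface, and the non-interactive Agg backend so that it also runs without a display. The output file name is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical hour values standing in for Order_Details['Hour']
hours = pd.Series([21, 9, 21, 14, 21, 9, 3])
per_hour = hours.value_counts().reindex(range(24), fill_value=0)

fig, ax = plt.subplots(figsize=(12, 6))
ax.bar(per_hour.index, per_hour.values, color="m")  # one bar per hour
ax.set_xticks(range(24))
ax.set_xlabel("Hour")
ax.set_ylabel("Number Of Purchases Made")
ax.set_title("Sales Happening Per Hour")
ax.grid(axis="y")

fig.savefig("hourly_sales.png")  # hypothetical output file name
```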
The results indicate that sales typically peak in the late evening hours. This insight can inform business decisions, for example by promoting a product specifically during that time.