Synopsis Format
PREDICTION
PROJECT SYNOPSIS
BACHELOR OF TECHNOLOGY
SUBMITTED BY
UNDER SUPERVISION
Of
Ms. Apoorva
Black Friday is an informal name for the Friday following Thanksgiving Day
in the United States, which is celebrated on the fourth Thursday of November. The
day after Thanksgiving has been regarded as the beginning of the United States
Christmas shopping season since 1952, although the term "Black Friday" did not
become widely used until more recent decades. Many stores offer highly promoted
sales on Black Friday and open very early, such as at midnight, or may even start
their sales at some time on Thanksgiving. The major challenge for a retail store or
e-commerce business is to choose product prices that yield maximum profit by the end
of the sale. Our project deals with determining product prices based on
historical retail store sales data. After generating predictions, our model will
help the retail store decide product prices that earn more profit.
The technologies used in our project include the Python
programming language along with several of its libraries, such as NumPy,
pandas, Matplotlib, Seaborn, and scikit-learn. The programming environment
is the Jupyter Notebook, available as part of the Anaconda software
distribution. The Joblib library is used to save the trained model, whereas the Flask
library is used to build a web application that performs prediction
on Black Friday sales.
The main steps adopted in this project are
data cleaning, data preprocessing, feature engineering, exploratory data analysis
(EDA), machine learning model training and evaluation, and model deployment.
A retail company, “ABC Private Limited”, wants to understand customer purchase
behaviour (specifically, purchase amount) for various products across different categories.
It has shared a purchase summary of various customers for selected high-volume products
from last month. The data set also contains customer demographics (age, gender, marital
status, city_type, stay_in_current_city), product details (product_id and product category)
and the total purchase_amount from last month.
The company now wants to build a model that predicts a customer's purchase amount for various
products, which will help it create personalized offers for customers on different
products.
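The fields listed above mix numeric and categorical values, so they need to be encoded before a regression model can use them. The following is a minimal sketch, assuming pandas and a handful of invented rows. The column names follow the fields named above, while the values and the choice of one-hot encoding are illustrative assumptions rather than the project's actual preprocessing.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the fields listed above,
# but every value here is invented for illustration.
df = pd.DataFrame({
    "age": ["0-17", "26-35", "26-35", "55+"],
    "gender": ["F", "M", "M", "F"],
    "marital_status": [0, 1, 0, 1],
    "city_type": ["A", "B", "C", "A"],
    "stay_in_current_city": ["2", "4+", "1", "0"],
    "product_id": ["P001", "P002", "P001", "P003"],
    "product_category": [3, 1, 3, 8],
    "purchase_amount": [8370, 15200, 1422, 7969],
})

# Separate the regression target from the predictors.
y = df["purchase_amount"]
X = df.drop(columns=["purchase_amount"])

# Binary gender becomes 0/1; the remaining text columns are one-hot encoded.
X["gender"] = X["gender"].map({"F": 0, "M": 1})
X = pd.get_dummies(
    X, columns=["age", "city_type", "stay_in_current_city", "product_id"]
)

print(X.shape)
```

One-hot encoding a high-cardinality column such as product_id can produce a very wide table on the full data set, so a different encoding may be preferable in practice.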
Feasibility Study: Retail companies recognize the need to analyze and predict their sales and
customer behavior across their products and product categories. Our project aims to help retail
companies create personalized deals and promotions for their customers through a data
framework that allows them to handle massive sales volumes with more efficient models. In
this project we used the Black Friday Sales dataset from the Kaggle website,
which contains nearly 550,000 observations with 10 features, both qualitative and
quantitative. Because the target variable is continuous, regression models are suited to this
case. We implemented a linear regression model and a random forest regressor. These machine learning
algorithms were used to predict future prices and sales.
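The two model families named above can be compared with a short scikit-learn script. This is only a sketch under stated assumptions: the data is synthetic (not the Kaggle data set), and the feature count, split ratio, and hyperparameters are illustrative choices, not the project's actual settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded features; NOT the Kaggle data.
rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(500, 5)).astype(float)
# Continuous target with an interaction term plus noise.
y = 1000 * X[:, 0] + 200 * X[:, 1] * X[:, 2] + rng.normal(0, 100, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit each model and record its RMSE on the held-out split.
rmse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))
    print(f"{name}: RMSE = {rmse[name]:.1f}")
```

Evaluating both models on the same held-out split keeps the comparison fair. Which model scores better depends on the data.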
Predicting sales through machine learning algorithms has enormous
benefits for retail companies. We used machine learning regression models to achieve
better performance in predicting sales. We successfully established a machine learning
regression model which can estimate the gross Black Friday sales for a particular customer,
based on a distinct set of related and meaningful features, to a fair level of accuracy.
Over the past two decades, Black Friday has been the biggest shopping day of the year. Retail
stores are overrun by crowds, and many products are deeply discounted. Patterns have
emerged over time for this big shopping day. Our project provides a model for predicting sales
and targeting customers' behavior through machine learning regression and neural networks.
However, the model still faced challenges in accuracy.
There are many applications of sales prediction in e-commerce, including
product recommendation systems, time series prediction, and online promotion of
products. One related work explored multi-task dynamic features through a
deep item network, focusing on the online advertisement of global shopping festivals
presented by Alibaba. Another analyzed online shopping during Chinese festivals through
collaborative filtering recommendations based on users' online browsing and
purchasing behavior to improve sales. However, the results suggest that there is a need to
estimate customers' purchases and the number of customers for various products and
product preferences based on different variables. Moreover, there is a need to analyze
customer behavior to create personalized offers for customers purchasing specific products
during Black Friday sales.
Our goal was to build a system that recommends to customers what they usually buy based on
historical data. Moreover, it will help decision makers, data analysts, and data scientists see
the popularity of product categories across other attributes such as customers' city/state,
age group, and gender. In this way, they can plan future sales, promotions, and
extended product lines.
We first used the process of data acquisition, then applied exploratory data analysis (EDA).
The next step was to perform feature engineering, in the forms of data preprocessing and data
cleaning.
Methodology/Planning of work:
The fundamental objective was to generate some valuable insights from data and
subsequently transform it into a format that is suitable for applying machine learning
algorithms.
The entire project workflow is based on the following steps:
9. Save the best performing model using the Joblib library for model
deployment into production.
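The model-saving step can be sketched with `joblib.dump` and `joblib.load`. The tiny linear model and the file name `black_friday_model.joblib` below are hypothetical stand-ins for illustration. In the project, the saved object would be whichever model performed best.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny stand-in model; the real project would save its best performing model.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([10.0, 20.0, 30.0, 40.0])
model = LinearRegression().fit(X, y)

# The file name is a hypothetical choice for this sketch.
model_path = os.path.join(tempfile.mkdtemp(), "black_friday_model.joblib")
joblib.dump(model, model_path)

# Later, for example inside the web application, reload the model and predict.
loaded_model = joblib.load(model_path)
prediction = loaded_model.predict(np.array([[5.0]]))
print(prediction)
```

A model saved this way should be loaded with compatible versions of scikit-learn and joblib, since the serialized format is not guaranteed to be portable across versions.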
Facilities required for proposed work:
Software Required: Anaconda Navigator, Anaconda Command
Prompt, Jupyter Notebook
Hardware Required: None
The prediction model will help to analyze the relationships among various attributes.
The Black Friday Sales dataset is used for training and prediction. It is a large
online dataset that is also accepted by various e-commerce websites.
The model will provide a prediction based on the age of the customer, city
category, occupation, etc. It is implemented using models such as linear
regression, ridge regression, lasso regression, Decision Tree Regressor, and Random Forest
Regressor.
We have proposed a prediction model that analyzes a customer's past spending and predicts
the customer's future spending. The dataset used is the Black Friday Sales dataset from
Kaggle. The models considered include machine learning models such as Linear Regression. Root
Mean Squared Error (RMSE) is the performance measure used to evaluate the models.
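RMSE is the square root of the mean of the squared differences between actual and predicted values, so lower values indicate better predictions. A minimal sketch using only the Python standard library follows. The function name `rmse` and the sample purchase amounts are invented for illustration.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented purchase amounts, for illustration only.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 320.0]
score = rmse(actual, predicted)
print(round(score, 2))
```

Because the errors are squared before averaging, RMSE penalizes large individual errors more heavily than small ones, and it is expressed in the same unit as the purchase amount.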
Simple regression problems can be solved with simple models like linear
regression instead of complex neural network models. With traditional methods not being of
much help to business growth in terms of revenue, machine learning approaches
prove important for shaping a business plan that takes into consideration
the shopping patterns of consumers.
Projecting sales with respect to several factors, including last year's sales, helps businesses
adopt suitable strategies for increasing the sales of goods that are in demand.
Thus, the proposed model will predict customer purchases on Black Friday and give the
retailer insight into customers' choice of products. This will enable discounts based on
customer-centric choices, benefiting the retailer as well as the customer.
Bibliography:
[1]: https://www.researchgate.net.in
[2]: https://www.ijert.org.in
[3]: https://www.thinkindiaquaterly.com