Automated E-Commerce Price Comparison Website Using PHP XAMPP MongoDB Django and Web Scrapping
2023 International Conference on Computer Communication and Informatics (ICCCI), Jan 23-25, 2023,
Coimbatore, India
| Ref. No. | Author Details | Type of the website | Features of the website | Results |
|---|---|---|---|---|
| [6] | Nimbalkar et al. | Spam bot | Post-comment similarity, the ratio of stop words, and redundancy | 86% accuracy has been achieved by this model using the Decision tree classifier |
| [7] | Khatter et al. | Real-time bot | HTTP response code and inter-arrival time | Bots were discovered before the session expired |
| [8] | Dharmik et al. | Web Crawler | Percentage of requested images, duration of the session, and the response code from the web | 95% accuracy has been achieved by this model |
| [9] | Sharma et al. | Chatbot | Size of the message and the delay time between them | 100% accuracy has been achieved by this model |
| [10] | Sarker et al. | Web Scraper | The entropy of inter-request time and requested bytes | 91% accuracy has been achieved by this model |
| [11] | Raza et al. | Social Media bot | Probability of transition features between the user and click streams | The accuracy of the model has increased by an average of 91% in total |
| [12] | Chee et al. | Price Scraping bot | Twenty-five features such as total requested pages, session time, and the standard deviation of time between the requests | 90% accuracy has been achieved using Cascade neural network |
III. MATERIALS AND METHODS
1. MongoDB
MongoDB is a document-oriented database classified under NoSQL. The system has to deal with large amounts of unstructured data, so MongoDB is a convenient choice of database. The data extracted by the scraper is stored in the MongoDB database [13].
2. Django Web Framework
Django is a Python web framework. The e-commerce product price comparison website based on web scraping is built with the Django web framework. Products requested by users are queried in the database using mongo-engine, an object mapper [14-18].
PHASE 1:
Data Collection and Analysis - A series of investigations yielded useful information on power and energy consumption. The results also gave a better understanding of what a rating website is, how it helps people solve problems before buying home groceries, and which current rating websites can serve as examples to test against competitors. Data on previous research was obtained from student-written term papers and was described earlier in the literature review phase. Meanwhile, customer data was obtained through a survey and interviews with online respondents, in addition to assembly [19-22].
PHASE 2:
Survey of any existing similar system - The next step is to look at the results achieved and check whether any similar or comparable system exists. The main purpose of studying a comparable existing system is to learn how it works, what idea it is built on, what it calculates, and how it handles difficulties [23-27].
PHASE 3:
Design of the main components of the system - After the study of comparable existing systems, the next step is to identify the main feature that will make the system more advanced. To use the services of this website, customers must log in with their basic details, i.e., name, e-mail, etc. Registered customers can automatically be given access to the newsletter of this website. Customers will be able to select a product, and related listings can be displayed. In addition, customers can add further information, such as their preferred goods, to their profile, so that the current price of their favorite items can be sent to them individually in addition to the day's merchandise mail. Thus, customers can obtain the information they are interested in without delay. The main additions of the system are a flexible database for storing goods and customer records, and a search function that lets a user find the product he is interested in [28-33].
PHASE 4:
Develop the model architecture - The following section explains the structure and working of the system. This provides a clear picture of the system and how it works, and prevents it from becoming a system that does not solve the problems.
WEB CRAWLER :
Web crawlers, or web spiders, are used to automatically search websites and gather information over the internet. As a first step, the re-fetched URLs are sent to the scraper for scraping.
WEB SCRAPER :
It is used to extract the HTML data from the URLs sent by the crawler and use it for purposes that are not public. In this system, Python libraries such as requests and beautifulsoup4 are used for scraping. Beautifulsoup4 is a Python library used for HTML page parsing. Using this data, product information from various e-commerce websites is extracted and stored in a database.
IV. Proposed Methodology
Fig.1 Proposed Architecture
In this paper, we develop a website where the prices of the desired products from Amazon and Flipkart are displayed using PHP, XAMPP, MongoDB, and web scraping. Fig.1 describes the architecture of the proposed system.
We have used XAMPP for the implementation of our code. This package includes MySQL, the Apache web server, Perl, an FTP server, PHP, and phpMyAdmin.
PHP is a server-side scripting language and a very useful tool for making interactive web pages. No dataset is used in the project; the data is scraped on demand by web scraping and the results are displayed. This software collection also includes many other components, which are explained below.
XAMPP Control Panel:
It helps to control and regulate the various components present in XAMPP. The latest update of this software is Version 3.2.1.
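As an illustration of the comparison step, in which prices scraped from Amazon and Flipkart are shown side by side, the following minimal Python sketch normalizes displayed price strings and picks the lowest offer per product. The record layout, the field names (store, product, price_text), and the sample values are assumptions made for this example and are not taken from the implemented system; the records only imitate documents that a scraper might store in MongoDB.

```python
from decimal import Decimal
import re

# Hypothetical scraped records, shaped like documents a scraper might store.
# The field names and values are illustrative assumptions.
SCRAPED = [
    {"store": "Amazon", "product": "USB-C cable", "price_text": "₹1,299.00"},
    {"store": "Flipkart", "product": "USB-C cable", "price_text": "₹1,249"},
    {"store": "Amazon", "product": "Wireless mouse", "price_text": "₹599"},
    {"store": "Flipkart", "product": "Wireless mouse", "price_text": "₹ 649.50"},
]

# First run of digits, with optional thousands commas and decimal part.
PRICE_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text):
    """Turn a displayed price such as '₹1,299.00' into a Decimal."""
    match = PRICE_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no numeric price in {text!r}")
    return Decimal(match.group(0).replace(",", ""))


def cheapest_offers(records):
    """Return {product: (store, price)} holding the lowest price per product."""
    best = {}
    for record in records:
        price = parse_price(record["price_text"])
        current = best.get(record["product"])
        if current is None or price < current[1]:
            best[record["product"]] = (record["store"], price)
    return best


if __name__ == "__main__":
    for product, (store, price) in cheapest_offers(SCRAPED).items():
        print(f"{product}: cheapest at {store} for {price}")
```

Decimal is used instead of float so that prices compare and display exactly as scraped.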
Why PHP?
Like MySQL, PHP is free to use and open source. Packages such as XAMPP already include a web server, MySQL, and PHP, among others. This makes PHP a cost-effective scripting language when compared to languages such as CFML or ASP. One of the other advantages of PHP
Using Simple HTML Dom
It is a library developed for PHP that helps us access the page's content much more easily with selectors. You only need the simple_html_dom.php file from the zip file. This file should be placed in the same
Installing PHP-CURL
It is not always necessary, but more advanced requests need to send custom headers, and the PHP-CURL library helps with this. Don't forget to restart the Apache server after installing the library. Figs. 2, 3, and 4 show the XAMPP control panel and web scraping using PHP.
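The idea of attaching custom headers to a request can be sketched as follows. This is not PHP-CURL: Python's standard urllib.request module stands in for it here, so the sketch only illustrates the concept. The URL, the bot name in the User-Agent string, and the header values are hypothetical.

```python
from urllib.request import Request

# Hypothetical target URL and header values, chosen only for illustration.
URL = "https://www.example.com/search?q=usb+cable"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; PriceComparisonBot/0.1)",
    "Accept-Language": "en-IN,en;q=0.9",
}


def build_request(url, headers):
    """Create a GET request object that carries the given headers."""
    return Request(url, headers=headers, method="GET")


request = build_request(URL, HEADERS)

# Sending is left commented out so the sketch runs without network access:
# from urllib.request import urlopen
# with urlopen(request, timeout=10) as response:
#     html = response.read().decode("utf-8", errors="replace")
```

Building the request separately from sending it makes it easy to inspect or log the headers before any traffic reaches the target site.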
Scraping the content from various websites:
Fig 5. The interface of the Price Comparison Website
1. Check the website content
Most web content is displayed using HTML. Since we
need to extract specific content from the HTML source, it is
also necessary to understand it. First, we need to check what
the source of the page looks like to know what elements to
extract from the page.
In Google Chrome, you can do this by right-clicking on
the element you want to extract and selecting "Inspect
Element". This should open a window in your browser with
the page source and rendered element styles. In this window,
we only need to check the "Elements" tab, which will show
us how the HTML of the page is structured.
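Once the relevant elements have been identified in the browser, extracting them can be sketched as below. This is a minimal illustration that uses Python's standard html.parser module as a stand-in for a selector library such as beautifulsoup4 or Simple HTML Dom. The sample markup and the class names product-title and price are hypothetical; real class names have to be read from the inspected page.

```python
from html.parser import HTMLParser

# Hypothetical product markup; real class names come from inspecting the page.
SAMPLE_HTML = """
<div class="product">
  <span class="product-title">USB-C cable</span>
  <span class="price">₹1,249</span>
</div>
"""

# Tags that never have a closing tag and must not affect nesting depth.
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying one of the wanted classes."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.current = None  # class currently being captured, if any
        self.depth = 0       # nesting depth inside the captured element
        self.found = {}      # class name -> list of extracted strings

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.current is not None:
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self.current, self.depth = name, 1
                self.found.setdefault(name, []).append("")
                break

    def handle_endtag(self, tag):
        if self.current is not None:
            self.depth -= 1
            if self.depth == 0:
                self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.found[self.current][-1] += data


parser = ClassTextExtractor({"product-title", "price"})
parser.feed(SAMPLE_HTML)
extracted = {name: [t.strip() for t in texts] for name, texts in parser.found.items()}
```

Here extracted maps each wanted class name to the list of text values found under it, which is the shape a later storage or comparison step would consume.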
3. Al-Mushayt, O. S., Gharibi, W., & Armi, N. (2022). An E-Commerce Control Unit for Addressing Online Transactions in Developing Countries: Saudi Arabia—Case Study. IEEE Access, 10, 64283-64291.
4. Zhang, X., Shen, K., Zhang, C., Fan, X., Xiao, Y., He, Z., ... & Wu, L. (2022). Scenario-based Multi-product Advertising Copywriting Generation for E-Commerce. arXiv preprint arXiv:2205.10530.
5. Niemir, M., & Mrugalska, B. (2022). Product Data Quality in e-Commerce: Key Success Factors and Challenges. Production Management and Process Control, 36, 1-12.
6. Nimbalkar, T. R., Bhadane, L. G., Kharatmal, D. B., & Borase, J. P. E-COMMERCE PORTAL WITH RECOMMENDATION SYSTEM FOR SURGICAL EQUIPMENT. Journal homepage: www.ijrpr.com, ISSN 2582-7421.
7. Khatter, H., Sharma, A., & Kushwaha, A. K. (2022, July). Web Scraping based Product Comparison Model for E-Commerce Websites. In 2022 IEEE International Conference on Data Science and Information System (ICDSIS) (pp. 1-6). IEEE.
8. Dharmik, H., Padmane, P., Dhoke, K., Chambhare, S., & Kohad, D. A Review on E-commerce Price Evaluation System.
9. Sharma, D. K., Lohana, S., Arora, S., Dixit, A., Tiwari, M., & Tiwari, T. (2022). E-Commerce product comparison portal for classification of customer data based on data mining. Materials Today: Proceedings, 51, 166-171.
10. Sarker, K. U., Saqib, M., Hasan, R., Mahmood, S., Hussain, S., Abbas, A., & Deraman, A. (2022). A Ranking Learning Model by K-Means Clustering Technique for Web Scraped Movie Data. Computers, 11(11), 158.
11. Raza, M. Z., Verma, P., Abirami, G., & Girija, R. (2022). Prediction of Consumer Purchase Intention Using E-Commerce Web Data. Telematique, 6679-6690.
12. Chee, C. C. F. C., Chiew, K. L., Sarbini, I. N., & Jing, E. K. H. (2022). Data Analytics Approach for Short-term Sales Forecasts Using Limited Information in E-commerce Marketplace. Acta Informatica Pragensia, 11(3).
13. Nagaraj, P., & Deepalakshmi, P. (2020). A framework for e-healthcare management service using recommender system. Electronic Government, an International Journal, 16(1-2), 84-100.
14. Nagaraj, P., Deepalakshmi, P., Mansour, R. F., & Almazroa, A. (2021). Artificial flora algorithm-based feature selection with gradient boosted tree model for diabetes classification. Diabetes, Metabolic Syndrome and Obesity: Targets and Therapy, 14, 2789.
15. Pa, N., Mb, A., Kb, B., & Ab, D. (2020). Analysis of data mining techniques in diagnalising heart disease. Intelligent Systems and Computer Technology, 37, 257.
16. Vb, S. K. (2020). Perceptual image super resolution using deep learning and super resolution convolution neural networks (SRCNN). Intelligent Systems and Computer Technology, 37(3).
17. Nagaraj, P., & Deepalakshmi, P. (2021). Diabetes Prediction Using Enhanced SVM and Deep Neural Network Learning Techniques: An Algorithmic Approach for Early Screening of Diabetes. International Journal of Healthcare Information Systems and Informatics (IJHISI), 16(4), 1-20.
18. Nagaraj, P., & Deepalakshmi, P. (2022). An intelligent fuzzy inference rule-based expert recommendation system for predictive diabetes diagnosis. International Journal of Imaging Systems and Technology.
19. Nagaraj, P., Deepalakshmi, P., & Ijaz, M. F. (2022). Optimized adaptive tree seed Kalman filter for a diabetes recommendation system—bilevel performance improvement strategy for healthcare applications. In Cognitive and Soft Computing Techniques for the Analysis of Healthcare Data (pp. 191-202). Academic Press.
20. Vignesh, K., & Nagaraj, P. (2022). Analysing the Nutritional Facts in Mc. Donald’s Menu Items Using Exploratory Data Analysis in R. In International Conference on Emerging Technologies in Computer Engineering (pp. 573-583). Springer, Cham.
21. Brintha, N. C., Nagaraj, P., Tejasri, A., Durga, B. V., Teja, M. T., & Kumar, M. N. V. P. (2022, June). A Food Recommendation System for Predictive Diabetic Patients using ANN and CNN. In 2022 7th International Conference on Communication and Electronics Systems (ICCES) (pp. 1364-1371). IEEE.
22. Nagaraj, P., Deepalakshmi, P., Muneeswaran, V., & Muthamil Sudar, K. (2022). Sentiment Analysis on Diabetes Diagnosis Health Care Using Machine Learning Technique. In Congress on Intelligent Systems (pp. 491-502). Springer, Singapore.
23. Nagaraj, P., Muneeswaran, V., Reddy, L. V., Upendra, P., & Reddy, M. V. V. (2020, May). Programmed multi-classification of brain tumor images using deep neural network. In 2020 4th international conference on intelligent computing and control systems (ICICCS) (pp. 865-870). IEEE.
24. Nagaraj, P., Rajasekaran, M. P., Muneeswaran, V., Sudar, K. M., & Gokul, K. (2020, August). VLSI implementation of image compression using TSA optimized discrete wavelet transform techniques. In 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT) (pp. 667-670). IEEE.
25. Vamsi, A. M., Deepalakshmi, P., Nagaraj, P., Awasthi, A., & Raj, A. (2020). IOT based autonomous inventory management for warehouses. In EAI International Conference on Big Data Innovation for Sustainable Cognitive Computing (pp. 371-376). Springer, Cham.
26. Muneeswaran, V., Nagaraj, P., Dhannushree, U., Ishwarya Lakshmi, S., Aishwarya, R., & Sunethra, B. (2021). A Framework for Data Analytics-Based Healthcare Systems. In Innovative Data Communication Technologies and Application (pp. 83-96). Springer, Singapore.
27. Nagaraj, P., Rao, J. S., Muneeswaran, V., & Kumar, A. S. (2020, May). Competent ultra data compression by enhanced features excerption using deep learning techniques. In 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS) (pp. 1061-1066). IEEE.
28. Muneeswaran, V., Nagaraj, M. P., Rajasekaran, M. P., Chaithanya, N. S., Babajan, S., & Reddy, S. U. (2021, July). Indigenous Health Tracking Analyzer Using IoT. In 2021 6th International Conference on Communication and Electronics Systems (ICCES) (pp. 530-533). IEEE.
29. Muneeswaran, V., BenSujitha, B., Sujin, B., & Nagaraj, P. (2020). A compendious study on security challenges in big data and approaches of feature selection. International Journal of Control and Automation, 13(3), 23-31.
30. Varma, C. G., Nagaraj, P., Muneeswaran, V., Mokshagni, M., & Jaswanth, M. (2021, May). Astute Segmentation and Classification of leucocytes in blood microscopic smear images using titivated K-means clustering and robust SVM techniques. In 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS) (pp. 818-824). IEEE.
31. Sudar, K. M., Nagaraj, P., Deepalakshmi, P., & Chinnasamy, P. (2021, January). Analysis of Intruder Detection in Big Data Analytics. In 2021 International Conference on Computer Communication and Informatics (ICCCI) (pp. 1-5). IEEE.
32. Sudar, K. M., Deepalakshmi, P., Nagaraj, P., & Muneeswaran, V. (2020, November). Analysis of Cyberattacks and its Detection Mechanisms. In 2020 Fifth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN) (pp. 12-16). IEEE.
33. Sudar, K. M., Deepalakshmi, P., Ponmozhi, K., & Nagaraj, P. (2019, December). Analysis of Security Threats and Countermeasures for various Biometric Techniques. In 2019 IEEE International Conference on Clean Energy and Energy Efficient Electronics Circuit for Sustainable Development (INCCES) (pp. 1-6). IEEE.