Python scraping task
Build a scraper for Amazon's Best Sellers section. The scraper should authenticate using user credentials and collect details of
products on sale in any 10 categories, focusing on those with discounts greater than 50%. The
extraction should be limited to the top 1500 best-selling products within each category.
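The two selection rules above (discount greater than 50%, rank within the top 1500) could be expressed as a small filter. This is a minimal sketch, not a prescribed implementation: the function names are hypothetical, and it assumes the discount is shown as text containing a percentage, such as "-55%".

```python
import re
from typing import Optional

MIN_DISCOUNT_PERCENT = 50  # brief: discounts greater than 50%
MAX_RANK = 1500            # brief: top 1500 best sellers per category


def parse_discount(text: Optional[str]) -> Optional[int]:
    """Return the first percentage found in the text, or None if there is none."""
    match = re.search(r"(\d{1,3})\s*%", text or "")
    return int(match.group(1)) if match else None


def qualifies(rank: int, discount_text: Optional[str]) -> bool:
    """True when the product is within the rank limit and discounted by more than 50%."""
    discount = parse_discount(discount_text)
    return (
        discount is not None
        and discount > MIN_DISCOUNT_PERCENT
        and rank <= MAX_RANK
    )
```

Note that a product discounted by exactly 50% is excluded, since the brief asks for discounts greater than 50%.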
Requirements:
1. Authentication:
○ Utilize valid Amazon credentials to log in.
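One way to approach the login step is sketched below, assuming credentials are supplied through environment variables rather than hard-coded. The variable names AMAZON_EMAIL and AMAZON_PASSWORD, the sign_in_url parameter, and the element IDs are all assumptions for illustration; the real sign-in form should be inspected before relying on them.

```python
import os


def load_credentials(env=os.environ):
    """Read credentials from environment variables (names are hypothetical)."""
    try:
        return env["AMAZON_EMAIL"], env["AMAZON_PASSWORD"]
    except KeyError as exc:
        raise RuntimeError(f"Missing environment variable: {exc.args[0]}") from exc


def login(driver, email, password, sign_in_url, timeout=15):
    """Fill in a two-step sign-in form using an existing Selenium driver."""
    # Imported here so load_credentials can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    driver.get(sign_in_url)
    # The element IDs below are assumptions about the sign-in page markup.
    wait.until(EC.presence_of_element_located((By.ID, "ap_email"))).send_keys(email)
    driver.find_element(By.ID, "continue").click()
    wait.until(EC.presence_of_element_located((By.ID, "ap_password"))).send_keys(password)
    driver.find_element(By.ID, "signInSubmit").click()
```

The sketch does not handle extra verification steps such as CAPTCHAs or one-time passwords.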
2. Data Collection:
○ For each of the 10 categories, scrape the following details for products
meeting the discount and best-seller criteria:
■ Product Name
■ Product Price
■ Sale Discount
■ Best Seller Rating
■ Ship From
■ Sold By
■ Rating
■ Product Description
■ Number Bought in the Past Month (if available)
■ Category Name
■ All Available Images
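The fields listed above could be held in one record type so that every scraped product has the same shape. The sketch below is one possible layout; the class and attribute names are illustrative, and every field except the category and product name is optional, so a missing value on a page does not abort the record.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ProductRecord:
    category_name: str
    product_name: str
    product_price: Optional[str] = None
    sale_discount: Optional[str] = None
    best_seller_rating: Optional[str] = None
    ship_from: Optional[str] = None
    sold_by: Optional[str] = None
    rating: Optional[str] = None
    product_description: Optional[str] = None
    number_bought_past_month: Optional[str] = None  # only "if available"
    image_urls: List[str] = field(default_factory=list)


# asdict() turns a record into a plain dict, ready for CSV or JSON output.
example = asdict(ProductRecord(category_name="Kitchen", product_name="Example item"))
```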
3. Data Storage:
○ Store the scraped data in a CSV or JSON file in a structured format.
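Either output format can be written with the standard library alone. The sketch below assumes each record is a plain dict and that all records share the same keys. The function names and the "|" separator used to flatten list values (such as image URLs) into a single CSV cell are illustrative choices.

```python
import csv
import json


def save_json(records, path):
    """Write the records as a JSON array, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)


def save_csv(records, path, list_separator="|"):
    """Write the records as CSV; list values are joined into one cell."""
    if not records:
        return
    fieldnames = list(records[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        for record in records:
            row = {
                key: list_separator.join(value) if isinstance(value, list) else value
                for key, value in record.items()
            }
            writer.writerow(row)
```

JSON keeps the image list as a real array, while CSV is easier to open in a spreadsheet.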
4. Technical Specifications:
○ Use Python with the Selenium library for web scraping.
○ Implement robust error handling to manage exceptions during the scraping
process.
○ Ensure compliance with Amazon's terms of service regarding data scraping.
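For the error-handling requirement, one common pattern is to retry transient failures with an increasing delay and re-raise once the attempts run out. The decorator below is a generic sketch using only the standard library; its name, parameters, and defaults are illustrative.

```python
import functools
import logging
import time

logger = logging.getLogger("scraper")


def retry(exceptions, attempts=3, delay=1.0, backoff=2.0):
    """Retry the wrapped function on the given exceptions, then re-raise."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    if attempt == attempts:
                        raise  # out of attempts: surface the original error
                    logger.warning(
                        "%s failed (attempt %d of %d): %s",
                        func.__name__, attempt, attempts, exc,
                    )
                    time.sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator
```

With Selenium, the exceptions argument could be a tuple such as (TimeoutException, NoSuchElementException, StaleElementReferenceException) from selenium.common.exceptions, applied to the function that scrapes a single product page. Pausing between retries also helps keep the request rate low.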
Deliverables:
.
Sample URLs
● https://www.amazon.in/gp/bestsellers/?ref_=nav_em_cs_bestsellers_0_1_1_2
● https://www.amazon.in/gp/bestsellers/kitchen/ref=zg_bs_nav_kitchen_0
● https://www.amazon.in/gp/bestsellers/shoes/ref=zg_bs_nav_shoes_0
● https://www.amazon.in/gp/bestsellers/computers/ref=zg_bs_nav_computers_0
● https://www.amazon.in/gp/bestsellers/electronics/ref=zg_bs_nav_electronics_0
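The category links above appear to share one pattern: the path segment after "bestsellers" names the category. If that holds, a fallback value for the Category Name field could be taken from the URL itself. This is a sketch under that assumption, and the function name is hypothetical. The first link has no category segment, so the function returns None for it.

```python
from typing import Optional
from urllib.parse import urlparse


def category_slug(url: str) -> Optional[str]:
    """Return the path segment that follows 'bestsellers', if there is one."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    if "bestsellers" not in segments:
        return None
    index = segments.index("bestsellers")
    if index + 1 >= len(segments):
        return None
    candidate = segments[index + 1]
    # A segment such as 'ref=...' is a tracking token, not a category.
    return None if candidate.startswith("ref=") else candidate
```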