Web Scraping Assignment Ebay

The document discusses web scraping and provides examples of scraping product data from different websites. It demonstrates scraping attributes like name, price, shipping fee and image from an eBay page for Apple accessories. The code extracts over 10 records into a CSV file with the product name, price, shipping fee and image URL.

Uploaded by

Alya Rusmi

Web Scraping

Web scraping is the process of extracting useful data from a website. For this assignment, scrape a dataset from a retail link system so that the data can be analysed. Choose one product in the retail link system and extract four attributes, with at least ten records, from the selected web pages. For example:

Website: Mudah.com

Product: iPhone

Attributes: Model, Specification, Price, Image

Submit your Python code by copying it into a Microsoft Word document. Submit your data output as a Microsoft Excel file.

Website: eBay

Product: Accessories for Apple Tablets and eReaders

Attributes: Product Name, Product Price, Shipping Fee (free or not), Image

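# Import BeautifulSoup for HTML parsing and urlopen for fetching the page
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq

# eBay listing page for the chosen product category
my_url = 'https://www.ebay.com.my/b/Accessories-for-Apple-Tablets-and-eReaders/176970/bn_826388'

# Download the page, read the raw HTML, then close the connection
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# Parse the HTML and display the parsed document
page_soup = soup(page_html, 'html.parser')
page_soup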
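The `urlopen` call above sends Python's default request headers and sets no timeout. A minimal sketch of a slightly more defensive fetch, using only the standard library, could look like the following. The helper names `build_request` and `fetch_html`, the User-Agent string and the 10-second timeout are assumptions made for illustration; they are not part of the original assignment.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen

# Hypothetical User-Agent string; any descriptive value can be used.
USER_AGENT = "Mozilla/5.0 (compatible; scraping-assignment)"


def build_request(url):
    """Create a Request that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


def fetch_html(url, timeout=10):
    """Return the page body as bytes, or None if the request fails."""
    try:
        # The with-block closes the connection even if read() raises.
        with urlopen(build_request(url), timeout=timeout) as response:
            return response.read()
    except URLError as err:
        # HTTPError is a subclass of URLError, so HTTP status errors land here too.
        print("Request failed:", err)
        return None
```

With these helpers, `page_html = fetch_html(my_url)` could stand in for the three `uReq` lines, followed by a check that the result is not `None` before parsing.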

# Each product listing is an <li> with the classes 's-item s-item--large'
item = page_soup.findAll('li',{'class':'s-item s-item--large'})
item
item[0]

# The product name is the <h3> inside the link of the listing's info block
iteminfo = item[0].findAll('div',{'class':'s-item__info clearfix'})
iteminfo[0]
iteminfo[0].a
iteminfo[0].a.h3
itemname = iteminfo[0].a.h3.getText().strip()
itemname

# The price is the text of the first <span class="s-item__price">
itemprice = item[0].findAll('span',{'class':'s-item__price'})
itemprice[0]
price = itemprice[0].getText().strip()
price
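The scraped price is a string, so it cannot be summed or averaged directly. If the values are to be analysed numerically, the number can be pulled out of the text. The sketch below assumes prices look like `RM 1,234.50` (a currency label, optional thousands separators, a decimal point); the actual eBay text may differ, for example when a listing shows a price range, in which case only the first number is returned. `parse_price` is a hypothetical helper name.

```python
import re


def parse_price(text):
    """Return the first number found in a price string as a float, or None."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting to a float.
    return float(match.group().replace(",", ""))
```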

# The shipping fee is the text of the first shipping/logistics-cost <span>
log = item[0].findAll('span',{'class':'s-item__shipping s-item__logisticsCost'})
log[0]
postage = log[0].getText().strip()
postage
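The assignment asks whether shipping is free or not, while `postage` holds the raw shipping text. One possible way to turn it into a yes/no value is sketched below. It assumes that free listings contain the word "free" somewhere in the shipping text; that wording is an assumption about the page, and `is_free_shipping` is a hypothetical helper name.

```python
def is_free_shipping(postage_text):
    """Return True when the shipping text mentions 'free' (case-insensitive)."""
    return "free" in postage_text.lower()
```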

# The image URL is the src attribute of the <img> inside the image wrapper
image = item[0].findAll('div',{'class':'s-item__image-wrapper'})
image[0]
imagelink = image[0].img['src']
imagelink
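Each lookup above indexes `[0]` on the result of `findAll`. When a listing has no matching tag, `findAll` returns an empty list and `[0]` raises `IndexError`, which would stop a loop over all listings. A minimal sketch of a guard is shown below. `first_text` is a hypothetical helper, and `FakeTag` is only a stand-in for a BeautifulSoup tag so that the example runs without fetching a page.

```python
def first_text(matches, default="N/A"):
    """Return the stripped text of the first match, or default if there is none."""
    if not matches:
        return default
    return matches[0].getText().strip()


class FakeTag:
    """Stand-in for a BeautifulSoup tag, used only to demonstrate the helper."""

    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


print(first_text([FakeTag("  Free shipping ")]))  # prints "Free shipping"
print(first_text([]))                             # prints "N/A"
```

In the scraping code, a call such as `first_text(item[0].findAll('span',{'class':'s-item__price'}))` would take the place of the `[0].getText().strip()` chain.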

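# Print the four attributes for every listing on the page
for i in item:
    # Product name
    iteminfo = i.findAll('div',{'class':'s-item__info clearfix'})
    itemname = iteminfo[0].a.h3.getText().strip()
    print(itemname)
    # Product price
    itemprice = i.findAll('span',{'class':'s-item__price'})
    price = itemprice[0].getText().strip()
    print(price)
    # Shipping fee
    log = i.findAll('span',{'class':'s-item__shipping s-item__logisticsCost'})
    postage = log[0].getText().strip()
    print(postage)
    # Image URL: search inside the current listing (i), not the first listing
    image = i.findAll('div',{'class':'s-item__image-wrapper'})
    imagelink = image[0].img['src']
    print(imagelink)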

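# Write the same four attributes to ebay.csv, one listing per line
filename='ebay.csv'
f=open(filename, 'w')

for i in item:
    # Product name (commas replaced so they do not split the CSV column)
    iteminfo = i.findAll('div',{'class':'s-item__info clearfix'})
    itemname = iteminfo[0].a.h3.getText().strip()
    print(itemname)
    f.write(itemname.replace(',','|')+',')
    # Product price
    itemprice = i.findAll('span',{'class':'s-item__price'})
    price = itemprice[0].getText().strip()
    print(price)
    f.write(price.replace(',','|')+',')
    # Shipping fee
    log = i.findAll('span',{'class':'s-item__shipping s-item__logisticsCost'})
    postage = log[0].getText().strip()
    print(postage)
    f.write(postage.replace(',','|')+',')
    # Image URL: search inside the current listing (i), not the first listing
    image = i.findAll('div',{'class':'s-item__image-wrapper'})
    imagelink = image[0].img['src']
    print(imagelink)
    f.write(imagelink+'\n')

f.close()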
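The loop above builds each CSV line by hand and swaps commas for `|` so that they do not break the columns. Python's standard `csv` module is an alternative that quotes such fields instead of altering them. The sketch below writes to an in-memory buffer so it runs on its own. The `write_rows` helper, the header names and the sample record are assumptions for illustration, not scraped data.

```python
import csv
import io


def write_rows(handle, rows):
    """Write a header row plus one row per listing; csv.writer quotes commas."""
    writer = csv.writer(handle)
    writer.writerow(["name", "price", "shipping", "image"])
    writer.writerows(rows)


# Hypothetical sample record; real rows would come from the scraping loop.
sample = [("Case, Black", "RM 1,234.50", "Free shipping", "https://example.com/a.jpg")]

buffer = io.StringIO()
write_rows(buffer, sample)
print(buffer.getvalue())
```

To write a real file, the buffer could be replaced by `open('ebay.csv', 'w', newline='', encoding='utf-8')` used in a `with` statement, which also closes the file automatically.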
