
Web Scraping Mini Project 2 - Jupyter Notebook

This document outlines a mini project on web scraping using Python's BeautifulSoup and requests libraries to extract data from Cars.com. The project focuses on gathering information about certified used Mercedes-Benz vehicles, including name, mileage, rating, review count, and price. The initial steps include sending an HTTP request to the website, parsing the HTML content, and targeting specific data elements for extraction.

Uploaded by

Shah Alam

9/24/23, 1:22 PM WEB SCRAPING MINI PROJECT 2 - Jupyter Notebook

IMPORTS

In [1]: from bs4 import BeautifulSoup
        import requests
        import pandas as pd

HTTPS REQUESTS

Store the URL in website

In [2]: website = "https://www.cars.com/shopping/results/?stock_type=cpo&makes%5B%5D=mercedes_benz&models%5B%5D=&zip="
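
The query string above is percent-encoded by hand (`%5B%5D` stands for `[]`). As a minimal sketch, assuming only the parameters visible in that URL, the same string can be built with the standard library. The names `BASE_URL` and `params` are illustrative and do not appear in the notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.cars.com/shopping/results/"

# Parameters copied from the URL in cell In [2].
# urlencode turns "makes[]" into "makes%5B%5D".
params = {
    "stock_type": "cpo",         # certified vehicles, as targeted by the project
    "makes[]": "mercedes_benz",
    "models[]": "",              # left empty in the original URL
    "zip": "",                   # left empty in the original URL
}

website = f"{BASE_URL}?{urlencode(params)}"
print(website)
```

Keeping the parameters in a dictionary makes it easier to change the make or add a ZIP code without editing the encoded string.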

Send the GET request

In [3]: response = requests.get(website)

Status code

In [4]: response.status_code

Out[4]: 200
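
The status code is checked by eye here. A slightly more defensive sketch, assuming a timeout and an explicit User-Agent are wanted, could wrap the request in a function. The function name `fetch_html` and the header value are hypothetical and not part of the notebook.

```python
import requests

# Hypothetical header value; some sites respond differently to the default client string.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; mini-project-scraper)"}


def fetch_html(url, timeout=10):
    """Return the page body as bytes, raising on network or HTTP errors."""
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
    return response.content
```

With this helper the soup object could be created as `BeautifulSoup(fetch_html(website), 'html.parser')`.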

Soup object

In [5]: soup = BeautifulSoup(response.content, 'html.parser')

localhost:8888/notebooks/Documents/WEB SCRAPING MINI PROJECT 2.ipynb# 1/5



In [6]: soup
sellers directly from 9,245 Mercedes-Benz models nationwide." name="description"/>
<meta content="noindex, nofollow" name="robots">
<meta content="Cars.com" property="og:site_name"/>
<meta content="website" property="og:type"/>
<meta content="Certified Used Mercedes-Benz for Sale | Cars.com" property="og:title"/>
<meta content="https://www.cars.com/shopping/results/" property="og:url"/>
<meta content="Shop Mercedes-Benz vehicles for sale at Cars.com. Research, compare, and save listings, or contact sellers directly from 9,245 Mercedes-Benz models nationwide." property="og:description"/>
<meta content="https://assets.carsdn.co/image/upload/f_auto,q_auto/v1685047778/Cars-PD_OG-IMG.webp" property="og:image"/>
<meta content="290349283473" property="fb:app_id"/>
<link href="/images/favicon_1.png" rel="apple-touch-icon"/>
<meta content="app-id=353263352" name="apple-itunes-app"/>
<meta content="@carsdotcom" name="twitter:site"/>
<meta content="summary_large_image" name="twitter:card"/>
<meta content="Certified Used Mercedes-Benz for Sale | Cars.com" name="twitter:title"/>
<meta content="Shop Mercedes-Benz vehicles for sale at Cars.com. Research, compare, and save listings, or contact sellers directly from 9,245 Mercedes-Benz models nationwide." name="twitter:description"/>
<meta content="https://assets.carsdn.co/image/upload/f_auto,q_auto/v1685047778/Cars-PD_OG-IMG.webp" name="twitter:image"/>

Results

In [7]: results = soup.find_all('div',{'class':'vehicle-card'})

In [8]: len(results)

Out[8]: 21
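
Because `class` is a multi-valued attribute, searching for `vehicle-card` also matches cards whose class list contains additional values. The sketch below uses made-up markup to show this, together with two equivalent spellings of the same search.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; real Cars.com cards carry many more attributes.
html = """
<div class="vehicle-card inventory-ad">A</div>
<div class="vehicle-card">B</div>
<div class="banner">C</div>
"""
soup = BeautifulSoup(html, "html.parser")

by_attrs = soup.find_all("div", {"class": "vehicle-card"})  # form used in the notebook
by_kwarg = soup.find_all("div", class_="vehicle-card")      # keyword shorthand
by_css = soup.select("div.vehicle-card")                    # CSS selector

# All three searches return the first two divs and skip the banner.
print(len(by_attrs), len(by_kwarg), len(by_css))
```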


In [9]: results[0]

Out[9]: <div class="vehicle-card inventory-ad" data-inventory-ad="true"
data-koddi-click-tracking-url="https://cars.koddi.io/event-collection/beacon?action=click&amp;trackingData={trackingData}&amp;rank={rank}&amp;clientName=Cars&amp;beaconIssued=2023-09-23T22:24:44Z"
data-koddi-deeplink-params=""
data-koddi-impression-tracking-url="https://cars.koddi.io/event-collection/beacon?action=impression&amp;trackingData={trackingData}&amp;rank={rank}&amp;ts={ts}&amp;clientName=Cars&amp;beaconIssued=2023-09-23T22:24:44Z"
data-koddi-listing-id="7bb473d2-fbe6-4069-9b4a-bade7a131f0f"
data-koddi-tracking-data="0cI0dLPw2/1+JVtm5aX6SlYI981O7ItKUkmv3WXW1MFwWbRJDJn2MPvJzb+oYUMw85/r1tRczOH1o7keVCf2YxFM
UZc5EjUEqqP/K3eR5FhQ9ogu8IJFTd4BzT3QzIPkROZfxmhaAugMIeyPwQjoEW2wk4vaDM69w4kc3Ak1QkHdQFdE9hAMj7Tdqw/sCaPkxcwJg4/VS2
vSGmh1Qu63Wtr1ebfBM4qrt6/5XPy8USOMo9K7OGr7XTBR3Txblqgq5/AAoBm9RDynTOgj6g7UujPXDtKGKy6Pvvrb+MXVJdF3u6t4txUM1stlhEIH
a7TZX3BVU10CfUGG3vysLv13UGsU4ZYSrQfMspojgrn9ySvBKYnyJduwDt/shH0APrMntOoCCZMtQD3qSIM+PiDw/30ILShGMmYY3Xq65vQXq3GSRj
zRLpQeHw+FNNEbW962Xiz4tMPahGZr+J5kwtI0a9ridjsEHUQQcMa2iQ3Y+bfHgpUmQ9wWLc5qv1WO5RyJ7Zj+HEelOgrso2dIxTyQ74gA4pmkBVPF
5SGcIh+KJuUtzH1oX8c5MqDTMGnaK3ZBTWPeafKxdlnQ4ywTfPUEl4j9j7LWvgBLCczCtb6EqrWabszy8aj8oPx2biR52gMb8XTzOHasr9WA+kEjML
QuemqMMx4zhR44U/rM/Mf5CV948eI9+bWotqYnz2gGZe/wsSWwZm78B50K9J3LlwoPO6u1fPVa+rFdg7gtaH9l00z7mnVUyH8HC6v9gUPSXw0cVPBb
G3OrgnLPi7wmgpwnto1r21ctGn/k/1JBEIK1NB0ly+2CQmVFWfV75p02SSzlmrgg3IsGtvGHuk48cT7EyqNag8T+fZL1TCvfD7kQj56SaYdPp+ua/D
VSf3C60hrhDqtZnWJFerFxJo7e8ZbieO9e8lmygW1pI3ak4gQk1BYcvvpS1Wf1RS8jGK9ZvfMvN/XJ0Ycpk8lfOCwBX+QyjpFJXpluMS4D3K4URo0D
sHD6BJP9IfQ5IcCjjYRY+0IZsdomGXc7DDGHZ8amUM9UCNRIuTojp3XYw62fAWmT+Pbnzse52JJOBSEdMHNDU1YUkmic9DjaEMgVxlxnCVVO6CThGw
nhT1zPhL7WWtjMZlUkbp6BKLoxcPrs6GELqljttxEgZnLlO2Wrcz+wXr4GkHzKFEkaH0w/gvkSETb4Sk2g1IYukl2EHfmm3Ubeo4KAfcSjEjCk0TwJ
VdwL4cEUDAFJl5HzmMVdvkEx+bYCUL7yq/ztyJ1Jm39IRuCzI1fFHXSP4smTPD3RgMRV6AEGNpcYEMFdxIOjUU836mIfBgrZ1oqfgC0RIPb9kuG1a6
Z6ajehNHcUQiJIQUWetnRng+8tdSL3TpHPtMqAAQSGFyf6/YxqMmdh6V7DEmmY8ZI6d3B2h3L6M048iBt/M0gKLv4Og7btcIkkNy5dFxWMt8wkIxxr
LEIeZkFX/EJ25IVH0frO2xyzU1mNAdDCR8up6TMKfXzcCtP94q1erFnth9B3spDo7xKkWEooda/K5/QRIqz6iHtV5yEFRS2kBSNcle1kNa9Zm0RdZO
[... output truncated in the source ...]

Target necessary data

Name

Mileage

Rating

Review count

Price

Name

In [10]: results[0].find('h2').get_text()

Out[10]: '2023 Mercedes-Benz GLC 300 Base 4MATIC'
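
`find` returns `None` when nothing matches, so chaining `.get_text()` raises `AttributeError` on a card that lacks the element. A minimal sketch of a guard, using a hypothetical helper named `safe_text` and made-up markup:

```python
from bs4 import BeautifulSoup


def safe_text(card, name, attrs=None):
    """Return the stripped text of the first matching tag, or None if it is absent."""
    tag = card.find(name, attrs or {})
    return tag.get_text(strip=True) if tag is not None else None


# Made-up card containing a name but no mileage element.
card = BeautifulSoup(
    '<div class="vehicle-card"><h2> 2023 Mercedes-Benz GLC 300 Base 4MATIC </h2></div>',
    "html.parser",
)

print(safe_text(card, "h2"))                         # text of the <h2>, whitespace stripped
print(safe_text(card, "div", {"class": "mileage"}))  # None, because no such div exists
```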

Mileage

In [11]: results[0].find('div',{'class':'mileage'}).get_text()

Out[11]: '3,040 mi.'

Rating

In [12]: results[0].find('span', {'class':'sds-rating__count'}).get_text()

Out[12]: '4.7'

Review count

In [13]: results[0].find('span', {'class':'sds-rating__link sds-button-link'}).get_text()

Out[13]: '(784 reviews)'
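
A class value containing a space, as in the cell above, is compared against the whole `class` string, so it depends on the order in which the page lists the classes. The sketch below uses made-up markup with the two classes reversed to show the difference from a single-class search and from a CSS selector.

```python
from bs4 import BeautifulSoup

# Made-up markup: the same two classes as in the notebook, written in the opposite order.
html = '<span class="sds-button-link sds-rating__link">(784 reviews)</span>'
card = BeautifulSoup(html, "html.parser")

# Exact-string search: fails here because the class order differs.
exact = card.find("span", {"class": "sds-rating__link sds-button-link"})

# Single-class search and CSS selector: both ignore class order.
single = card.find("span", {"class": "sds-rating__link"})
css = card.select_one("span.sds-rating__link.sds-button-link")

print(exact)
print(single.get_text())
print(css.get_text())
```

If the site ever reorders these classes, the single-class or CSS form would keep working while the exact-string form would start returning `None`.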

Price

In [14]: results[0].find('span',{'class':'primary-price'}).get_text()

Out[14]: '$58,999'
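
The cells above read each field from `results[0]` only. A possible next step, sketched here under stated assumptions, is to loop over every card, convert the strings to numbers, and collect the rows in a pandas DataFrame. The sample markup, the helper names (`text_or_none`, `to_int`, `to_float`) and the column names are hypothetical; the selectors are the ones used in the notebook, and prices are assumed to be whole dollars.

```python
import re

import pandas as pd
from bs4 import BeautifulSoup

# Made-up sample standing in for the page; the second card lacks rating and reviews.
html = """
<div class="vehicle-card">
  <h2>2023 Mercedes-Benz GLC 300 Base 4MATIC</h2>
  <div class="mileage">3,040 mi.</div>
  <span class="sds-rating__count">4.7</span>
  <span class="sds-rating__link sds-button-link">(784 reviews)</span>
  <span class="primary-price">$58,999</span>
</div>
<div class="vehicle-card">
  <h2>Hypothetical second listing</h2>
  <div class="mileage">12,500 mi.</div>
  <span class="primary-price">$41,250</span>
</div>
"""
results = BeautifulSoup(html, "html.parser").find_all("div", {"class": "vehicle-card"})


def text_or_none(card, name, attrs=None):
    """Return the stripped text of the first matching tag, or None if it is absent."""
    tag = card.find(name, attrs or {})
    return tag.get_text(strip=True) if tag is not None else None


def to_int(text):
    """Keep only the digits, e.g. '3,040 mi.' becomes 3040; None if there are no digits."""
    digits = re.sub(r"\D", "", text or "")
    return int(digits) if digits else None


def to_float(text):
    """Convert e.g. '4.7' to 4.7; None if the text is missing or not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


rows = []
for card in results:
    rows.append({
        "name": text_or_none(card, "h2"),
        "mileage_mi": to_int(text_or_none(card, "div", {"class": "mileage"})),
        "rating": to_float(text_or_none(card, "span", {"class": "sds-rating__count"})),
        "review_count": to_int(
            text_or_none(card, "span", {"class": "sds-rating__link sds-button-link"})
        ),
        "price_usd": to_int(text_or_none(card, "span", {"class": "primary-price"})),
    })

df = pd.DataFrame(rows)
print(df)
```

Missing fields end up as `None` in `rows` and as missing values in the DataFrame, rather than stopping the loop with an exception.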
