Web Scraping Python - Chapter 1
WEB SCRAPING IN PYTHON
Thomas Laetsch
Data Scientist, NYU
Business Savvy
What are businesses looking for?
Comparing prices
Satisfaction of customers
Many options!
Acquisition!
The main example
Do we have to?
Information within HTML tags can be valuable
An attribute name is followed by = and then the value assigned to that attribute, usually quoted text.
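To make the name = "value" pattern concrete, here is a minimal sketch that lists every attribute found in a small HTML snippet using Python's standard-library html.parser module. The snippet, the AttributeCollector class, and the example.com URL are made up for illustration and are not from the slides.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Hypothetical helper: records (tag, attribute name, value) triples."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes already removed
        for name, value in attrs:
            self.found.append((tag, name, value))


# Illustrative HTML (an assumption, not taken from the course material)
html = '<div id="unique-id" class="some class"><a href="https://www.example.com">Link</a></div>'

collector = AttributeCollector()
collector.feed(html)

for tag, name, value in collector.found:
    print(f"<{tag}> has attribute {name} = {value!r}")
```

Each printed line pairs a tag with one of its attributes, which is the kind of information (ids, classes, link targets) a scraper typically looks for.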
Another Slasher Video?
Simple XPath:
xpath = '/html/body/div[2]'
Brackets [] after a tag name tell us which of the selected siblings to choose.
Direct to all table elements anywhere in the HTML document:
xpath = '//table'
Direct to all table elements which are descendants of the 2nd div child of the body
element:
xpath = '/html/body/div[2]//table'
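The three XPath patterns above can be tried out with a short sketch. It uses the standard-library xml.etree.ElementTree module, which is an assumption made here for convenience and not the tool named in the slides. ElementTree supports only a subset of XPath and evaluates paths relative to the root html element, so '/html/body/div[2]' is written as './body/div[2]' and '//table' as './/table'. The sample document and its id values are invented for illustration, and it must be well-formed XML for this parser to accept it.

```python
import xml.etree.ElementTree as ET

# Illustrative, well-formed document (an assumption, not from the slides)
html = """<html>
  <body>
    <div><p>first div</p></div>
    <div>
      <table id="t1"><tr><td>a</td></tr></table>
      <div><table id="t2"><tr><td>b</td></tr></table></div>
    </div>
    <table id="t3"><tr><td>c</td></tr></table>
  </body>
</html>"""

root = ET.fromstring(html)  # root is the <html> element

# Equivalent of '/html/body/div[2]': the 2nd div child of body
second_div = root.findall("./body/div[2]")

# Equivalent of '//table': every table element in the document
all_tables = root.findall(".//table")

# Equivalent of '/html/body/div[2]//table': tables at any depth inside that div
div_tables = root.findall("./body/div[2]//table")

print(len(second_div))
print([t.get("id") for t in all_tables])
print([t.get("id") for t in div_tables])
```

Note that the bracket index starts at 1, not 0, and that the double slash reaches tables nested at any depth, while a single slash would only match direct children.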