Scrapy
Developer(s): Zyte (formerly Scrapinghub)
Initial release: 26 June 2008
Stable release: 2.9.0[1] / 8 May 2023
Repository: [Link]/scrapy/scrapy ([Link]com/scrapy/scrapy)
Written in: Python

Scrapy (/ˈskreɪpaɪ/[2] SKRAY-peye) is a free and open-source web-crawling
framework written in Python and developed in Cambuslang. Originally designed
for web scraping, it can also be used to extract data using APIs or as a
general-purpose web crawler.[3] It is currently maintained by Zyte (formerly
Scrapinghub), a web-scraping development and services company.

The Scrapy project architecture is built around "spiders", which are
self-contained crawlers that are given a set of instructions. Following the
spirit of other don't repeat yourself frameworks, such as Django,[4] it makes
it easier to build and scale large crawling projects by allowing developers to
reuse their code.

The Scrapy framework provides features such as auto-throttle and rotating
proxies and user agents, which help a crawler avoid detection. Scrapy also
provides a web-crawling shell, which developers can use to test their
assumptions about a site’s behavior.[5]

Some well-known companies and products using Scrapy are Lyst,[6][7]
[Link],[8] Sayone Technologies,[9] Sciences Po Medialab,[10] and
[Link]’s World Government Data site.[11]

Operating system: Windows, macOS, Linux
Type: Web crawler
License: BSD License
Website: [Link] (https://[Link]g)

History
Scrapy was born at London-based web-aggregation and e-
commerce company Mydeco, where it was developed and
maintained by employees of Mydeco and Insophia (a web-consulting company based in Montevideo,
Uruguay). The first public release was in August 2008 under the BSD license, with a milestone 1.0 release
happening in June 2015.[12] In 2011, Zyte (formerly Scrapinghub) became the new official
maintainer.[13][14]
References
1. "Release 2.9.0" ([Link] 8 May 2023. Retrieved 31 May 2023.
2. Commit 975f150 ([Link]e04433d9811dd)
3. Scrapy at a glance ([Link]
4. "Frequently Asked Questions" ([Link]m-django). Frequently Asked Questions, Scrapy 2.8.0 documentation. Retrieved 28 July 2015.
5. "Scrapy shell" ([Link] Retrieved 28 July 2015.
6. Bell, Eddie; Heusser, Jonathan. "Scalable Scraping Using Machine Learning" ([Link][Link]/web/20160604082034/[Link] Archived from the original ([Link] on 4 June 2016. Retrieved 28 July 2015.
7. Scrapy | Companies using Scrapy ([Link]
8. Montalenti, Andrew (October 27, 2012). "Web Crawling & Metadata Extraction in Python" (https://[Link]/amontalenti/web-crawling-and-metadata-extraction-in-python). Web Crawling & Metadata Extraction in Python - Speaker Deck. Retrieved May 11, 2015.
9. "Scrapy Companies" ([Link] Scrapy | Companies using Scrapy.
10. Hyphe v0.0.0: the first release of our new webcrawler is out! ([Link][Link]/blog/hyphe-v0-0-0-the-first-release-of-our-new-webcrawler-is-out/)
11. Ben Firshman [@bfirsh] (21 January 2010). "World Govt Data site uses Django, Solr, Haystack, Scrapy and other exciting buzzwords [Link]/5jU3La #opendata #datastore" (https://[Link]/bfirsh/status/8025368963) (Tweet) – via Twitter.
12. Medina, Julia (19 June 2015). "Scrapy 1.0 official release out!" ([Link]rum/#!topic/scrapy-users/sMbBVIq0sko). scrapy-users (Mailing list).
13. Hoffman, Pablo (2013). List of the primary authors & contributors ([Link]crapy/blob/master/AUTHORS). Retrieved 18 November 2013.
14. Interview Scraping Hub ([Link]webcrawling/).
External links
Official website ([Link]
Scrapy Tutorial Series ([Link]