Setup API for GeeksforGeeks user data using WebScraping and Flask
Last Updated: 25 Apr, 2019
Prerequisite: WebScraping in Python, Introduction to Flask
In this post, we will discuss how to get information about a GeeksforGeeks user using web scraping and serve that information as an API using Python's micro-framework, Flask.
Step #1: Visit the auth profile
To scrape a website, the first step is to visit the website.
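Visiting the page from code amounts to a single HTTP GET. The following is a minimal sketch, assuming profile pages are served at auth.geeksforgeeks.org/user/<handle>/profile (the pattern used in this article, which may have changed since it was written). The helper names `build_profile_url` and `fetch_profile_html` are hypothetical and only for illustration.

```python
import requests

# URL pattern assumed from this article; the live site may differ today.
PROFILE_URL = "https://auth.geeksforgeeks.org/user/{}/profile"


def build_profile_url(user_handle):
    """Return the profile URL for a user handle (hypothetical helper)."""
    return PROFILE_URL.format(user_handle)


def fetch_profile_html(user_handle, timeout=10):
    """Download the profile page; return None on a non-200 response."""
    response = requests.get(build_profile_url(user_handle), timeout=timeout)
    if response.status_code != 200:
        return None
    return response.text
```

Passing a `timeout` keeps a slow or unreachable site from blocking the caller indefinitely.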
Step #2: Inspect the Page Source
In the above image, you can spot the descDiv div, which contains the user data. In the following image, spot the four mdl-grid divs for the four pieces of information.

Dive deep, spot the two blocks for attribute and its corresponding value. That's your user-profile data.

The following function puts all of this together and returns the data as a dictionary. It returns None when the page has no descDiv div, which is the case for a handle that does not exist.
Python3
def get_profile_detail(user_handle):
    url = "https://auth.geeksforgeeks.org/user/{}/profile".format(user_handle)
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html5lib')

    # The descDiv div holds the user data; it is absent for unknown handles.
    description_div = soup.find('div', {'class': 'descDiv'})
    if not description_div:
        return None

    # The first mdl-cell inside descDiv wraps the mdl-grid rows.
    user_details_div = description_div.find('div', {'class': 'mdl-cell'})
    if not user_details_div:
        return {'user profile': {}}
    specific_details = user_details_div.find_all('div', {'class': 'mdl-grid'})

    user_profile = {}
    for detail_div in specific_details:
        # Each row holds two cells: the attribute name and its value.
        block = detail_div.find_all('div', {'class': 'mdl-cell'})
        if len(block) < 2:
            continue
        attribute = block[0].text.strip()
        value = block[1].text.strip()
        user_profile[attribute] = value
    return {'user profile': user_profile}
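To see the parsing logic work without touching the network, the same traversal can be run against a static snippet. This is only a sketch: the HTML below is a made-up stand-in for the descDiv / mdl-cell / mdl-grid nesting described above, the attribute names and values are invented, and `parse_profile` is a hypothetical helper. It uses the built-in `html.parser` so that html5lib is not required.

```python
from bs4 import BeautifulSoup

# Invented sample that mimics the nesting described in the article.
SAMPLE_HTML = """
<div class="descDiv">
  <div class="mdl-cell">
    <div class="mdl-grid">
      <div class="mdl-cell">Institution</div>
      <div class="mdl-cell">Example University</div>
    </div>
    <div class="mdl-grid">
      <div class="mdl-cell">Rank</div>
      <div class="mdl-cell">42</div>
    </div>
  </div>
</div>
"""


def parse_profile(html):
    """Return attribute/value pairs, or None if descDiv is missing."""
    soup = BeautifulSoup(html, "html.parser")
    description_div = soup.find("div", {"class": "descDiv"})
    if description_div is None:
        return None
    details_div = description_div.find("div", {"class": "mdl-cell"})
    if details_div is None:
        return {}
    profile = {}
    for row in details_div.find_all("div", {"class": "mdl-grid"}):
        cells = row.find_all("div", {"class": "mdl-cell"})
        if len(cells) >= 2:  # skip rows without an attribute/value pair
            profile[cells[0].text.strip()] = cells[1].text.strip()
    return profile


print(parse_profile(SAMPLE_HTML))
print(parse_profile("<p>no profile here</p>"))
```

Separating parsing from fetching like this makes the scraper easy to check against saved pages when the site's markup changes.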
Step #3: Articles and Improvements list
This time, try to spot the various tags yourself.

If you spotted the relevant HTML elements, you can easily write the scraping code yourself.
If not, here is the code.
Python3
def get_articles_and_improvements(user_handle):
    articles_and_improvements = {}
    url = "https://auth.geeksforgeeks.org/user/{}/articles".format(user_handle)
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html5lib')
    contribute_section = soup.find('section', {'id': 'contribute'})
    improvement_section = soup.find('section', {'id': 'improvement'})

    # Published articles: every link inside the section's ordered list.
    # Guard against a missing section so .find() is never called on None.
    contribution_list = contribute_section.find('ol') if contribute_section else None
    number_of_articles = 0
    articles = []
    if contribution_list:
        article_links = contribution_list.find_all('a')
        number_of_articles = len(article_links)
        for article in article_links:
            article_obj = {'title': article.text,
                           'link': article['href']}
            articles.append(article_obj)
    articles_and_improvements['number_of_articles'] = number_of_articles
    articles_and_improvements['articles'] = articles

    # Improvements: same structure as the articles section.
    improvement_list = improvement_section.find('ol') if improvement_section else None
    number_of_improvements = 0
    improvements = []
    if improvement_list:
        improvement_links = improvement_list.find_all('a')
        # Count the links, not the <ol> tag itself: len() on a tag counts
        # all of its child nodes, including whitespace text nodes.
        number_of_improvements = len(improvement_links)
        for improvement in improvement_links:
            improvement_obj = {'title': improvement.text,
                               'link': improvement['href']}
            improvements.append(improvement_obj)
    articles_and_improvements['number_of_improvements'] = number_of_improvements
    articles_and_improvements['improvements'] = improvements
    return articles_and_improvements
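Since the articles and improvements sections share the same shape, the link extraction can be sketched once and tried on a static snippet. The HTML and URLs below are invented placeholders, and `extract_links` is a hypothetical helper rather than part of the original script. The count is taken from the list of links, because `len()` on a BeautifulSoup tag counts its child nodes (including whitespace text) rather than the links.

```python
from bs4 import BeautifulSoup

# Invented sample: one populated section and one empty section.
SAMPLE_HTML = """
<section id="contribute">
  <ol>
    <li><a href="https://example.com/a1">First article</a></li>
    <li><a href="https://example.com/a2">Second article</a></li>
  </ol>
</section>
<section id="improvement"></section>
"""


def extract_links(soup, section_id):
    """Return title/link dicts for every <a> in the section's <ol>."""
    section = soup.find("section", {"id": section_id})
    ordered_list = section.find("ol") if section is not None else None
    if ordered_list is None:
        return []  # missing section or empty list
    return [{"title": a.text.strip(), "link": a["href"]}
            for a in ordered_list.find_all("a")]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
articles = extract_links(soup, "contribute")
improvements = extract_links(soup, "improvement")
print(len(articles), articles)
print(len(improvements), improvements)
```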
Step #4: Setup Flask
The web scraping code is complete. Now it's time to set up our Flask server. Here's the setup for the Flask app, along with the imports needed for the entire script.
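Python3
from bs4 import BeautifulSoup
import requests
from flask import Flask, jsonify, make_response

app = Flask(__name__)
# Keep keys in insertion order in JSON responses. Flask 2.3 and later ignore
# this config key; there, set app.json.sort_keys = False instead.
app.config['JSON_SORT_KEYS'] = False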
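The key-ordering setting depends on the Flask version. The following sketch shows one way to disable key sorting on both older and newer releases, assuming that Flask 2.2+ exposes the JSON provider as `app.json` and that older releases read the `JSON_SORT_KEYS` config key. The dictionary being serialized is an invented placeholder.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Newer Flask: configure the JSON provider. Older Flask: use the config key.
if hasattr(app, "json"):
    app.json.sort_keys = False
else:
    app.config["JSON_SORT_KEYS"] = False

# jsonify needs an application context; a test request context provides one.
with app.test_request_context():
    body = jsonify({"user profile": {}, "articles": []}).get_data(as_text=True)

print(body)
```

With sorting disabled, "user profile" stays ahead of "articles" in the output, matching the order in which the keys were inserted.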
Step #5: Setting Up for API
Now that we have both functions, the only remaining task is to combine their results, convert the dictionary to JSON, and serve it.
Here's the code for an endpoint that serves the API based on the user handle it receives. Remember that the endpoint can receive an invalid user handle at any time, so it must handle that case too: it responds with a 404 status and an error message.
Python3
@app.route('/<user_handle>/')
def home(user_handle):
    response = get_profile_detail(user_handle)
    if response:
        # Known handle: merge the profile with articles and improvements.
        response.update(get_articles_and_improvements(user_handle))
        api_response = make_response(jsonify(response), 200)
    else:
        # Unknown handle: get_profile_detail returned None.
        response = {'message': 'No such user with the specified handle'}
        api_response = make_response(jsonify(response), 404)
    api_response.headers['Content-Type'] = 'application/json'
    return api_response
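The endpoint's two branches can be exercised without scraping anything by using Flask's built-in test client. In this sketch the two scraping functions are replaced by stubs that return canned data; the handle "alice" and the values returned are invented for illustration.

```python
from flask import Flask, jsonify, make_response

app = Flask(__name__)


def get_profile_detail(user_handle):
    # Stub: pretend only the hypothetical handle "alice" exists.
    if user_handle == "alice":
        return {"user profile": {"Rank": "42"}}
    return None


def get_articles_and_improvements(user_handle):
    # Stub: canned data in the same shape as the real function's result.
    return {"number_of_articles": 0, "articles": [],
            "number_of_improvements": 0, "improvements": []}


@app.route('/<user_handle>/')
def home(user_handle):
    response = get_profile_detail(user_handle)
    if response:
        response.update(get_articles_and_improvements(user_handle))
        return make_response(jsonify(response), 200)
    response = {'message': 'No such user with the specified handle'}
    return make_response(jsonify(response), 404)


client = app.test_client()
ok = client.get("/alice/")
missing = client.get("/nobody/")
print(ok.status_code, ok.get_json())
print(missing.status_code, missing.get_json())
```

To run the real server instead, a common choice is to add `app.run()` under an `if __name__ == '__main__':` guard at the end of the script.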
Combining all the code, you've got a fully functioning server that serves the scraped data as a dynamic API.