How to get data from LinkedIn using Python
Last Updated :
30 Jan, 2023
LinkedIn is a professional networking platform that connects people within an industry, and job seekers with recruiters. Do you have a requirement to extract data from various LinkedIn profiles? If so, this article walks you through it step by step.
Stepwise Implementation:
Step 1: First of all, import sleep from the time library and the required Selenium modules. Most importantly, we need a web driver (managed here by webdriver_manager) to open the login page and control the browser.
Python3
from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
Step 2: Now, declare the web driver and make it headless, i.e., run the browser in the background without opening a window. Here we use the ChromeOptions feature of the web driver and pass it the headless argument.
Python3
options = webdriver.ChromeOptions()
options.add_argument("--headless")  # run Chrome without opening a window
exe_path = ChromeDriverManager().install()
service = Service(exe_path)
driver = webdriver.Chrome(service=service, options=options)
Step 3: Then, open the LinkedIn login page and make Python sleep for some time. sleep() pauses the program for the given number of seconds so that the page can finish loading before we interact with it.
Python3
driver.get("https://www.linkedin.com/login")
sleep(5)
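A fixed sleep() either wastes time on a fast connection or is too short on a slow one. Selenium also offers explicit waits (WebDriverWait with expected_conditions), which poll until a condition holds. The polling idea behind them can be sketched in plain Python; wait_for here is a hypothetical helper of our own, not part of Selenium:

```python
from time import monotonic, sleep

def wait_for(condition, timeout=10, interval=0.5):
    # Poll `condition` until it returns a truthy value or `timeout`
    # seconds elapse; this is the idea behind Selenium's WebDriverWait.
    deadline = monotonic() + timeout
    while monotonic() < deadline:
        result = condition()
        if result:
            return result
        sleep(interval)
    raise TimeoutError("condition not met within %s seconds" % timeout)

# With Selenium, the condition would be something like:
# wait_for(lambda: driver.find_elements(By.ID, "username"))
print(wait_for(lambda: 42, timeout=1))  # prints 42
```

In practice you would use WebDriverWait directly; the sketch just shows why polling is more robust than a blind pause.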
Now let's define the LinkedIn email id and the password for that account.
Python3
# login credentials
linkedin_username = "#LinkedIn Email Id"
linkedin_password = "#LinkedIn Password"
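Hard-coding credentials in the source file is risky if the script is ever shared. A common alternative is to read them from environment variables; this is a sketch assuming you export LINKEDIN_EMAIL and LINKEDIN_PASSWORD in your shell first (both variable names are our own choice):

```python
import os

def load_credentials():
    # Read credentials from the environment instead of the source file;
    # LINKEDIN_EMAIL / LINKEDIN_PASSWORD are hypothetical variable names.
    email = os.environ.get("LINKEDIN_EMAIL", "")
    password = os.environ.get("LINKEDIN_PASSWORD", "")
    if not email or not password:
        raise RuntimeError("set LINKEDIN_EMAIL and LINKEDIN_PASSWORD first")
    return email, password
```

The script would then call linkedin_username, linkedin_password = load_credentials() instead of embedding the strings.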
Step 4: Next, automate entering your LinkedIn email and password, then make Python sleep for a few seconds before clicking the sign-in button. This step is crucial: if the login fails, none of the code after it will work and the desired output will not be shown.
Python3
driver.find_element(
    By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[1]/input"
).send_keys(linkedin_username)
driver.find_element(
    By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[2]/input"
).send_keys(linkedin_password)
sleep(3)  # give the form a moment before submitting
driver.find_element(
    By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[3]/button"
).click()
Step 5: Now, create a list containing the URLs of the profiles from which you want to get data.
Python3
profiles = ['#Linkedin URL-1', '#Linkedin URL-2', '#Linkedin URL-3']
Step 6: Create a loop to open each URL and extract information from it. The driver automates the visits, performing the same operations on every profile without any manual intervention.
Python3
for i in profiles:
    driver.get(i)
    sleep(5)
Step 6.1: Next, obtain the title (the profile's name heading) from each profile.
Python3
    title = driver.find_element(
        By.XPATH,
        "//h1[@class='text-heading-xlarge inline t-24 v-align-middle break-words']"
    ).text
    print(title)
Step 6.2: Now, obtain the description of each person from the URLs.
Python3
    description = driver.find_element(
        By.XPATH, "//div[@class='text-body-medium break-words']"
    ).text
    print(description)
    sleep(4)
Step 7: Finally, close the browser.
Python3
driver.close()
Example:
Python3
from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
options = webdriver.ChromeOptions()
options.add_argument("--headless")
exe_path = ChromeDriverManager().install()
service = Service(exe_path)
driver = webdriver.Chrome(service=service, options=options)
driver.get("https://www.linkedin.com/login")
sleep(6)
linkedin_username = "#LinkedIn Email Id"
linkedin_password = "#LinkedIn Password"
driver.find_element(
    By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[1]/input"
).send_keys(linkedin_username)
driver.find_element(
    By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[2]/input"
).send_keys(linkedin_password)
sleep(3)
driver.find_element(
    By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[3]/button"
).click()
profiles = ['https://www.linkedin.com/in/vinayak-rai-a9b231193/',
            'https://www.linkedin.com/in/dishajindgar/',
            'https://www.linkedin.com/in/ishita-rai-28jgj/']
for i in profiles:
    driver.get(i)
    sleep(5)
    title = driver.find_element(
        By.XPATH,
        "//h1[@class='text-heading-xlarge inline t-24 v-align-middle break-words']"
    ).text
    print(title)
    description = driver.find_element(
        By.XPATH, "//div[@class='text-body-medium break-words']"
    ).text
    print(description)
    sleep(4)
driver.close()
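Printing the results throws them away once the console scrolls past. Writing them to a CSV file keeps them for later analysis; here is a sketch with the standard csv module, assuming the title/description pairs have been collected into a list during the loop (the sample rows below are placeholders, not real scraped data):

```python
import csv

# Hypothetical results collected in the scraping loop above;
# these sample rows are placeholders, not real scraped data.
rows = [("Jane Doe", "Software Engineer at Example Corp"),
        ("John Roe", "Data Analyst")]

with open("profiles.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["title", "description"])  # header row
    writer.writerows(rows)
```

In the real script you would append (title, description) to rows inside the for loop and write the file once after it finishes.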
Output: