
Assignment_1

The assignment requires students to create a Python script that scrapes IMDb for specific movie details, including the poster, video trailer, storyline, user reviews, genre, director, cast, and IMDb rating. Students must submit a zip file containing the script and a requirements file by January 24, 2025, and ensure the script handles errors and outputs information in a readable format. The assignment is part of the CS6224 course and is worth a total of 20 marks, with specific mark distribution for each task.

Uploaded by

Rohit Kamble

Assignment 1: Scraping IMDb for Movie Details

Course Code: CS6224 (Text Mining and Analytics)

Total Marks: 20

Deadline: 24/1/25

Submission Format: A zip file named roll_number_assign1.zip containing the following:

1. A Python script (main.py).
2. A requirements.txt file for dependency installation.

Objective:

To scrape an IMDb movie page, given its URL by the user, and extract the following information:

1. Poster
2. Video Trailer
3. Storyline
4. Five User Reviews
5. Genre
6. Director
7. Cast
8. IMDb Rating
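As one possible starting point for the fields listed above, the sketch below reads structured data out of the page instead of relying on CSS selectors. It assumes the movie page embeds a schema.org Movie description in a `<script type="application/ld+json">` block; that assumption, and the property names used (`image`, `trailer.embedUrl`, `description`, `genre`, `director`, `actor`, `aggregateRating.ratingValue`), should be checked against a live page. The helper names `JsonLdParser`, `names`, `as_dict`, and `extract_movie` are hypothetical. It uses only the standard library so it runs without installing anything; the same logic could be written with beautifulsoup4. The five user reviews are not covered here, and `description` is a short plot summary that may differ from the page's Storyline section.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def names(value):
    """Return the 'name' of each entry; accepts a dict, a list of dicts, or None."""
    if value is None:
        return []
    items = value if isinstance(value, list) else [value]
    return [i["name"] for i in items if isinstance(i, dict) and "name" in i]


def as_dict(value):
    """Return value if it is a dict, else an empty dict (for missing sections)."""
    return value if isinstance(value, dict) else {}


def extract_movie(html):
    """Return a dict of movie details, or None if no Movie block is found."""
    parser = JsonLdParser()
    parser.feed(html)
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than crash
        if isinstance(data, dict) and data.get("@type") == "Movie":
            genre = data.get("genre") or []
            return {
                "poster": data.get("image"),
                # Assumed property name; missing trailers yield None.
                "trailer": as_dict(data.get("trailer")).get("embedUrl"),
                "storyline": data.get("description"),
                "genre": [genre] if isinstance(genre, str) else genre,
                "director": names(data.get("director")),
                "cast": names(data.get("actor")),
                "rating": as_dict(data.get("aggregateRating")).get("ratingValue"),
            }
    return None
```

Every lookup goes through `.get`, so a page that lacks a trailer or rating produces `None` or an empty list for that field instead of an exception.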

Assignment Guidelines:

1. Environment Setup:

   ○ Create a Python virtual environment.
   ○ Install necessary libraries using pip (e.g., beautifulsoup4, requests, lxml).

2. Functional Requirements:

   ○ The Python script should accept an IMDb movie URL as input.
   ○ Extract the specified details from the webpage.
   ○ Handle errors gracefully (e.g., invalid URLs, missing data).
   ○ Output the extracted information in a readable format.

3. Mark Distribution:

   Task                                 Marks
   Extract Poster                       5
   Extract Video Trailer                5
   Extract Storyline                    2
   Extract Five User Reviews            5
   Extract Genre, Director, and Cast    2
   Extract IMDb Rating                  1

4. Submission Instructions:

   ○ Zip your main.py and requirements.txt files into a single archive named roll_number_assign1.zip.
   ○ Ensure the script is well-documented and follows best coding practices.

5. Execution:

   ○ To run the script, users should:
      1. Install dependencies using pip install -r requirements.txt.
      2. Execute the script to scrape IMDb details for the provided URL.
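
The input and error-handling requirements above could be approached along the following lines. This is a minimal sketch, not a reference solution: it assumes the URL is passed as the first command-line argument and that a movie URL has a path of the form /title/tt followed by digits. The names `imdb_title_id`, `fetch`, and `main` are hypothetical, and the standard-library `urllib` stands in for requests so the sketch has no dependencies.

```python
import re
from urllib.error import URLError
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Assumed shape of a movie URL path, e.g. /title/tt0111161/
TITLE_PATH = re.compile(r"^/title/(tt\d+)")


def imdb_title_id(url):
    """Return the title id (e.g. 'tt0111161') or None if the URL is not valid."""
    try:
        parts = urlparse(url)
    except ValueError:
        return None
    if parts.scheme not in ("http", "https"):
        return None
    host = parts.hostname or ""
    if host != "imdb.com" and not host.endswith(".imdb.com"):
        return None
    match = TITLE_PATH.match(parts.path)
    return match.group(1) if match else None


def fetch(url, timeout=10):
    """Download the page and return it as text."""
    # Some sites reject requests without a browser-like User-Agent header.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def main(argv):
    """Validate the argument, fetch the page, and return an exit status."""
    if len(argv) != 2:
        print("usage: python main.py <imdb-movie-url>")
        return 2
    if imdb_title_id(argv[1]) is None:
        print("error: not an IMDb title URL")
        return 1
    try:
        html = fetch(argv[1])
    except (URLError, TimeoutError) as exc:
        print(f"error: could not fetch page: {exc}")
        return 1
    print(f"fetched {len(html)} characters")
    return 0
```

In main.py this would typically be invoked as `sys.exit(main(sys.argv))` under the usual `if __name__ == "__main__":` guard, so that a bad URL or a network failure ends with a short message and a non-zero exit status instead of a traceback.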

Note:

● Ensure the script works for multiple IMDb URLs.
● Handle edge cases where certain details might be missing on the IMDb page.
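
One way to cover the missing-details case while keeping the output readable is to format every field through a single function that substitutes a placeholder. The sketch below assumes the scraped details are held in a dict; the key names, the `format_details` helper, and the "Not available" wording are all hypothetical choices.

```python
# (dict key, label shown to the user); the key names are assumptions.
FIELDS = [
    ("poster", "Poster"),
    ("trailer", "Video Trailer"),
    ("storyline", "Storyline"),
    ("genre", "Genre"),
    ("director", "Director"),
    ("cast", "Cast"),
    ("rating", "IMDb Rating"),
]

MISSING = "Not available"


def format_details(details):
    """Render scraped details as text, marking absent fields explicitly."""
    lines = []
    for key, label in FIELDS:
        value = details.get(key)
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        if value is None or value == "":
            value = MISSING
        lines.append(f"{label}: {value}")

    # Reviews are printed one per line, numbered from 1.
    reviews = details.get("reviews") or []
    lines.append("User Reviews:")
    if reviews:
        for number, review in enumerate(reviews, start=1):
            lines.append(f"  {number}. {review}")
    else:
        lines.append(f"  {MISSING}")
    return "\n".join(lines)
```

Because absent keys, `None`, empty strings, and empty lists all collapse to the same placeholder, the extraction code can simply leave out whatever it fails to find.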

Good Luck!

Submit your zip file here:

https://www.dropbox.com/request/IlauDoTcuR9Czk8pVaBf
