Assignment Index Format Guidelines

The document provides instructions for 14 data science assignments involving data analysis, data visualization, and Python programming. Key tasks include analyzing a dataset of salaries and tenures to find average salaries for different experience levels, generating random passwords, analyzing character frequencies in text files, and plotting scatter plots, bar charts, and histograms from the sample data provided.

Uploaded by

Chirantan Sahoo


Centre for Data Science

Institute of Technical Education & Research, SOA, Deemed to be University

Introduction to Data Science using Python (CSE 3054)

MINOR ASSIGNMENT-1

1. An anonymous dataset containing each user’s salary (in dollars) and tenure as a data scientist (in
years) is given.

salaries_and_tenures = [(83000, 8.7), (88000, 8.1), (48000, 0.7),
(76000, 6), (69000, 6.5), (76000, 7.5), (60000, 2.5), (83000, 10),
(48000, 1.9), (63000, 4.2)]

Find the average salary for each tenure and print a message according to its value, i.e. “less than
two”, “between two and five” or “more than five” years of tenure, then group together the salaries
corresponding to each bucket. Compute the average salary for each group.

2. For the above data, there seems to be a correspondence between years of experience and paid accounts:
users with very few and very many years of experience tend to pay; users with average amounts of
experience don’t. Find out the condition for this correspondence and print it.

3. Write a Python script to generate random alphanumeric passwords. Ask the user to enter the length of
each password and the number of passwords to generate, then save all the generated passwords
in a text file named “[Link]”.

4. Given a file named “[Link]” containing several lines or paragraphs, find all unique characters (ignoring
spaces, commas, full stops, brackets, quotes, etc.) present in the file. Capital and small letters are
counted as the same.
Find the frequency ($f_i$) of every character in the file and print the output as follows.
The character “a” is present $f_a$ times in the document.
The character “t” is present $f_t$ times in the document.

5. Use the above program as a function and use it to write another function to compare contents of two
files “[Link]” and “[Link]”.

a. The output must also give the following information.

File MyText1 contains more (or less or equal) characters than MyText2.
b. The output must be printed in the following format depending on the content of the files.
File MyText1 contains more (or less or equal) unique characters than MyText2.
c. The frequency of each character must be summarised.
The frequency of character “x” in file MyText1 is more than (or less than or equal to)
that in MyText2.
d. The relative frequency of each character must also be summarised.
The relative frequency of character “x” in file MyText1 is more than (or less than or
equal to) that in MyText2.

The input files should be nonempty.

6. Read a list named StringList1 containing strings from the keyboard. Generate a list MStringList1
that contains all items of StringList1 that are repeated two or more times, and print this
list. By observing the contents of MStringList1, perform the following tasks:


a. Check whether an item of MStringList1 occurs an even or an odd number of times in
StringList1.
b. Remove the ith (i ≥ 2) occurrence of a given word in StringList1.

7. From the file “[Link]”, count the frequencies of the letters of the alphabet (converting upper case to
lower case) and plot the results as a bar chart, with the letter on the x-axis and the corresponding
frequency on the y-axis.

8. Use the following data to plot the number of applicants per year as a scatter plot.

year = [2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012]
no_application_per_year = [921261, 929198, 1043739, 1186454,
1194938, 1304495, 1356805, 1282000, 479651]

9. Plot $x \sin x$, $x^2 \sin x$, $x^3 \sin x$ and $x^4 \sin x$ in a single plot over the range $x \in [-10, 10]$.

10. Plot histograms of the ages of males and females in separate plots, using the following data.

male_age = [53,51,71,31,33,39,52,27,54,30,64,26,21,54,52,20,59,32]

female_age = [53,65,68,21,75,46,24,63,61,24,49,41,39,40,25,54,42,
32,48,23,23]

11. Plot the temperature extremes in a certain region of India for each month, starting in January. The
values (in degrees Celsius) are given below.

max: 17, 19, 21, 28, 33, 38, 37, 37, 31, 23, 19, 18
min: -62, -59, -56, -46, -32, -18, -9, -13, -25, -46, -52, -58

12. Write a Python program to find all numbers in a range (given by the user) that are perfect squares
and whose digits sum to less than 10.

13. Plot a bar chart with axis labels for the given data:

mentions = [500, 505]

years = [2017, 2018]

Do not set any extra conditions on the x-axis or the y-axis. Then plot the bar chart for this data again,
this time starting the y-axis from 0.
State the difference between the two bar charts.

14. Draw a scatter plot of the following data, first with unequal axes and then with equal axes. Also state
the difference between the two.

test_1_grades = [ 99, 90, 85, 97, 80]

test_2_grades = [100, 85, 60, 90, 70]

Common questions

First, compile the items that appear more than once into a new list, keeping their frequency information. Check whether the occurrence count of each item in this list is odd or even. To modify the list, remove only the requested ith occurrence (i ≥ 2) of the given word, leaving the first occurrence and all other items untouched. This requires tracking how many times the word has been seen so far, for example with a counter or a dictionary, while preserving the order of the list.
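The steps above can be sketched as follows. This is a minimal sketch, not a reference solution: the function names and the sample list are hypothetical, and the assignment reads the list from the keyboard rather than hard-coding it.

```python
from collections import Counter


def repeated_items(items):
    """Return the items that occur at least twice, in order of first appearance."""
    counts = Counter(items)
    # Counter keeps keys in first-insertion order (Python 3.7+).
    return [item for item, n in counts.items() if n >= 2]


def occurrence_parity(items):
    """Map each repeated item to 'even' or 'odd' according to its count."""
    counts = Counter(items)
    return {item: "even" if counts[item] % 2 == 0 else "odd"
            for item in repeated_items(items)}


def remove_ith_occurrence(items, word, i):
    """Return a copy of items without the i-th (1-based) occurrence of word."""
    seen = 0
    result = []
    for item in items:
        if item == word:
            seen += 1
            if seen == i:
                continue  # skip only the i-th occurrence
        result.append(item)
    return result


# Hypothetical sample input standing in for keyboard input.
string_list1 = ["a", "b", "a", "c", "b", "a"]
m_string_list1 = repeated_items(string_list1)
```

Returning a new list from `remove_ith_occurrence` instead of deleting in place keeps the original StringList1 available for the parity check.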

To compare frequencies, write a program that reads both files, converts characters to lowercase, and counts character occurrences while ignoring spaces and punctuation. Compare absolute counts to establish which file has more or fewer occurrences of each character. For relative frequency, divide each character count by the total number of counted characters in its file and compare the proportions. Summarize the results as 'more', 'less', or 'equal' for both frequency types.
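A possible shape for this comparison is sketched below. It works on strings so that it stays self-contained; the function names are hypothetical, and treating "characters" as letters and digits only (via `str.isalnum`) is an assumption about what the assignment wants counted.

```python
from collections import Counter


def char_frequencies(text):
    """Count letters and digits in text, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalnum())


def relation(a, b):
    """Describe how a compares with b."""
    if a > b:
        return "more"
    if a < b:
        return "less"
    return "equal"


def compare_texts(text1, text2):
    """Compare total, unique, absolute and relative character counts."""
    f1, f2 = char_frequencies(text1), char_frequencies(text2)
    total1, total2 = sum(f1.values()), sum(f2.values())
    if total1 == 0 or total2 == 0:
        raise ValueError("both inputs must contain at least one counted character")
    summary = {}
    for ch in sorted(set(f1) | set(f2)):
        summary[ch] = (
            relation(f1[ch], f2[ch]),                    # absolute frequency
            relation(f1[ch] / total1, f2[ch] / total2),  # relative frequency
        )
    return relation(total1, total2), relation(len(f1), len(f2)), summary
```

To compare two files, pass the result of reading each file (for example `open(path).read()`) as `text1` and `text2`. A character can be "less" in absolute terms yet "equal" in relative terms, as with "a" in `compare_texts("ab", "aabb")`.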

Using unequal axis scales in scatter plots can distort the relationship between test grades, making differences appear more significant than they are. Equal scales ensure accurate plot representation, providing a realistic view of grade correlation. Distorted axes might suggest false trends or outperformances, while equal axes represent actual data relationships, aiding proper interpretation.
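The effect can be tried out with a sketch like the one below, assuming matplotlib is available. The function names are hypothetical; `ax.axis("equal")` is matplotlib's way of drawing one unit on the x-axis as long as one unit on the y-axis.

```python
test_1_grades = [99, 90, 85, 97, 80]
test_2_grades = [100, 85, 60, 90, 70]


def data_span(values):
    """Range covered by the data; default autoscaling stretches it to fill the axis."""
    return max(values) - min(values)


def plot_grades(equal_axes):
    """Scatter test 2 against test 1, optionally forcing equal axis scaling."""
    # Imported here so data_span() can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(test_1_grades, test_2_grades)
    ax.set_xlabel("test 1 grade")
    ax.set_ylabel("test 2 grade")
    if equal_axes:
        ax.axis("equal")
    ax.set_title("equal axes" if equal_axes else "default (unequal) axes")
    return fig
```

Test 1 grades span 19 points and test 2 grades span 40, so with default scaling the two axes use different units per centimetre; calling `plot_grades(False)` and `plot_grades(True)` side by side makes that difference visible.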

When plotting a bar chart, starting the y-axis from a value greater than 0 exaggerates differences because bar heights are no longer proportional to the values they represent; this may mislead viewers about the magnitude of changes. For the given data (500 and 505 mentions), a truncated axis makes a 1% increase look like a large jump. Starting the y-axis from 0 provides a true scale for comparing data visually, ensuring an accurate representation of proportions and differences without distortion.
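The distortion can be quantified with a short sketch, assuming matplotlib for the plotting part. The baseline of 499 is an assumed value chosen only for illustration, and the function names are hypothetical.

```python
mentions = [500, 505]
years = [2017, 2018]


def drawn_heights(values, y_min):
    """Visible bar heights when the y-axis starts at y_min instead of 0."""
    return [v - y_min for v in values]


def plot_mentions(y_min=None):
    """Bar chart of mentions per year; y_min sets where the y-axis starts."""
    # Imported here so drawn_heights() works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(years, mentions)
    ax.set_xticks(years)
    ax.set_xlabel("year")
    ax.set_ylabel("mentions")
    if y_min is not None:
        ax.set_ylim(bottom=y_min)
    return fig


# With the assumed baseline of 499, the second bar is drawn six times taller.
truncated = drawn_heights(mentions, 499)
# With a baseline of 0, the drawn heights equal the data values.
from_zero = drawn_heights(mentions, 0)
```

Calling `plot_mentions(499)` and `plot_mentions(0)` produces the two charts to compare.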

To determine the average salary for different experience groups, the dataset should be grouped into three categories based on tenure: 'less than two' years, 'between two and five' years, and 'more than five' years. For each group, filter the salaries and compute the average by summing the salaries within each group and dividing by the number of entries in the group.
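The grouping described above can be sketched as follows. The helper name `tenure_bucket` is hypothetical, and placing a tenure of exactly two years in the middle bucket and exactly five years in the top bucket is an assumption, since the assignment does not say how to treat the boundaries.

```python
from collections import defaultdict

salaries_and_tenures = [(83000, 8.7), (88000, 8.1), (48000, 0.7),
                        (76000, 6), (69000, 6.5), (76000, 7.5),
                        (60000, 2.5), (83000, 10), (48000, 1.9),
                        (63000, 4.2)]


def tenure_bucket(tenure):
    """Name the bucket a tenure (in years) falls into."""
    if tenure < 2:
        return "less than two"
    elif tenure < 5:
        return "between two and five"
    else:
        return "more than five"


# Group the salaries by bucket.
salary_by_bucket = defaultdict(list)
for salary, tenure in salaries_and_tenures:
    salary_by_bucket[tenure_bucket(tenure)].append(salary)

# Average salary per bucket: sum of the salaries divided by their count.
average_salary_by_bucket = {
    bucket: sum(salaries) / len(salaries)
    for bucket, salaries in salary_by_bucket.items()
}

for bucket, average in average_salary_by_bucket.items():
    print(f"{bucket}: {average:.2f}")
```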

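According to the problem statement, users with very few and very many years of experience tend to pay for accounts, while users with average amounts of experience do not. Taking the tenure buckets from the salary analysis as cutoffs, the condition can be stated as: a tenure of less than two years or more than five years suggests a paid account. The dataset lists only salary and tenure, so these cutoffs are a modelling choice rather than something read off the data. One possible interpretation is that people starting their careers and those with considerable experience value or need paid resources.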
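One way to express and print such a condition is sketched below. The function name is hypothetical, and the cutoffs of two and five years are taken from the text above as assumptions; the dataset contains no paid or unpaid labels against which they could be checked.

```python
def predict_paid_or_unpaid(years_experience, low=2, high=5):
    """Guess 'paid' for very short or very long tenures, 'unpaid' otherwise.

    low and high are assumed cutoffs, not values derived from the data.
    """
    if years_experience < low or years_experience > high:
        return "paid"
    return "unpaid"


def describe_condition(low=2, high=5):
    """Return the condition as a printable sentence."""
    return f"paid if tenure < {low} or tenure > {high} years, otherwise unpaid"


print(describe_condition())
```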

Iterate through the user-defined range and calculate perfect squares. For each perfect square, compute the sum of its digits. Check if the digit sum is less than 10, and if so, include the number in the result list. Utilize mathematical operations and control structures within a loop to efficiently implement this logic.
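A minimal sketch of this logic follows. Instead of testing every number in the range, it walks over integer roots and squares them, which gives the same result with fewer iterations. The function names are hypothetical, both bounds are assumed to be inclusive, and `math.isqrt` requires Python 3.8 or later.

```python
import math


def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(d) for d in str(n))


def squares_with_small_digit_sum(low, high):
    """Perfect squares in [low, high] whose digits sum to less than 10."""
    root = math.isqrt(max(low, 0))
    if root * root < low:
        root += 1  # first root whose square is >= low
    result = []
    while root * root <= high:
        square = root * root
        if digit_sum(square) < 10:
            result.append(square)
        root += 1
    return result
```

For user input, the bounds could be read with `int(input(...))` and passed to `squares_with_small_digit_sum`.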

Scatter plots, with application years on the x-axis and number of applications on the y-axis, show the trend of application numbers over time. This visual format helps identify growth, decline, or variability in applications, with each point representing a specific year's data. Patterns can be recognized as clusters or lines, indicating trends and anomalies.
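A sketch of such a plot, assuming matplotlib is available, could look like this. The function names are hypothetical, and the helper that finds the peak year is added only to show one pattern that the plot makes visible.

```python
year = [2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012]
no_application_per_year = [921261, 929198, 1043739, 1186454,
                           1194938, 1304495, 1356805, 1282000, 479651]


def year_of_peak(years, counts):
    """Return (year, count) for the year with the most applications."""
    count, peak_year = max(zip(counts, years))
    return peak_year, count


def plot_applications():
    """Scatter plot of applications per year."""
    # Imported here so year_of_peak() works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(year, no_application_per_year)
    ax.set_xlabel("year")
    ax.set_ylabel("number of applications")
    return fig
```

A scatter plot does not need the data sorted, so the descending order of `year` in the assignment can be used as given.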

To generate and store random passwords, use Python's 'random' and 'string' libraries to create alphanumeric passwords. Prompt users for password length and count, generate accordingly, and write each password to a file named 'MyPasswords.txt' using file handling operations to persist the data.
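The following sketch shows one way to do this. It swaps Python's `secrets` module in for `random`, because `secrets` is the standard-library module intended for security-sensitive values such as passwords; `random.choice` would work the same way for a classroom exercise. The function names are hypothetical.

```python
import secrets
import string

# Alphanumeric alphabet: upper-case letters, lower-case letters and digits.
ALPHABET = string.ascii_letters + string.digits


def generate_passwords(length, count):
    """Return count random alphanumeric passwords, each of the given length."""
    return ["".join(secrets.choice(ALPHABET) for _ in range(length))
            for _ in range(count)]


def save_passwords(passwords, path):
    """Write one password per line to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(passwords) + "\n")


def main():
    """Prompt for length and count, then save the passwords."""
    length = int(input("Password length: "))
    count = int(input("Number of passwords: "))
    save_passwords(generate_passwords(length, count), "MyPasswords.txt")
```

Calling `main()` runs the interactive version; the two helper functions can also be used on their own, for example in tests.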

To visualize alphabet frequency, first process the text file to convert all letters to lowercase and count each letter's occurrence while ignoring other characters. Using matplotlib or a similar library, plot these frequencies on a bar chart with letters on the x-axis and their corresponding frequencies on the y-axis, highlighting the distribution visually.
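A minimal sketch of these two steps, assuming matplotlib for the chart, is shown below. The function names are hypothetical, and counting only the 26 ASCII letters is an assumption about what "alphabets" covers.

```python
import string
from collections import Counter


def letter_frequencies(text):
    """Count a-z in text, treating upper and lower case as the same letter."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    # Return the counts in alphabetical order so the bars are sorted.
    return {letter: counts[letter] for letter in sorted(counts)}


def plot_letter_frequencies(text):
    """Bar chart with letters on the x-axis and frequencies on the y-axis."""
    # Imported here so letter_frequencies() works without matplotlib installed.
    import matplotlib.pyplot as plt

    freqs = letter_frequencies(text)
    fig, ax = plt.subplots()
    ax.bar(list(freqs), list(freqs.values()))
    ax.set_xlabel("letter")
    ax.set_ylabel("frequency")
    return fig
```

To chart a file, read its contents into a string first and pass that string to `plot_letter_frequencies`.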

Common questions

Discuss the steps needed to analyze repeated items from a list and modify them according to specific occurrences.

How can frequency and relative frequency of characters from a text file be compared between two files?

Explain how varying axis parameters affect the interpretation of scatter plot data using test grades.

How does initializing a bar chart with different y-axis settings impact the visualization of the given data?

What is the method to determine the average salary for different experience groups based on the given dataset?

In what way do years of experience correlate with paid accounts according to the provided data?

What approach would you use to determine perfect square numbers within a user-defined range where the digit sum is less than 10?

How can scatter plots be used to examine trends in application data over the years?

What technique would you use to generate and persistently store random passwords using Python?

How can one visualize the frequency of alphabets from a text file using a bar chart in Python?
