pandas-3

The document provides a tutorial on using the pandas library in Python, including data import, manipulation, and conversion techniques. It demonstrates how to handle null values, apply functions, and sort datasets, specifically using an automobile dataset. Additionally, it covers data type conversions and filtering methods to extract specific information from the dataset.

Uploaded by

praveen838307


pandas-3

April 21, 2025

[1]: import pandas as pd

df1 = pd.read_csv('Auto.csv')
df1.head()

[1]: mpg cylinders displacement Horse Power weight acceleration year \

0 18.0 8.0 307.0 130 3504 12.0 70
1 15.0 8.0 350.0 165 3693 11.5 70
2 NaN 8.0 318.0 150 3436 11.0 70
3 NaN 8.0 NaN 150 3433 12.0 70
4 NaN 8.0 NaN 140 3449 10.5 70

origin name
0 1 chevrolet chevelle malibu
1 1 buick skylark 320
2 1 plymouth satellite
3 1 amc rebel sst
4 1 ford torino

[ ]: # soft conversion --> applied wherever the conversion is possible

# bool --> int --> float --> complex --> str (str is at the highest level)

[3]: int(True)

[3]: 1

[4]: float(1)

[4]: 1.0

[5]: complex(1.0)

[5]: (1+0j)

[6]: str(1+0j)

[6]: '(1+0j)'

[ ]: # bool --> int --> float --> complex --> strings ( strings is at highest level)

[8]: pd.Series([1,2,3,4,5,1.4]) # a Series holds a single data type, so pandas assigns it based on the hierarchy (here float64)

[8]: 0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 1.4
dtype: float64
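The upcasting shown above can be checked directly. This is a minimal sketch with made-up values; the variable names `ints` and `mixed` are illustrative and not part of the original notebook.

```python
import pandas as pd

# All-integer input keeps an integer dtype.
ints = pd.Series([1, 2, 3])

# A single float in the input upcasts the whole Series to float64.
mixed = pd.Series([1, 2, 3, 1.4])

print(ints.dtype)
print(mixed.dtype)
```

Every integer in `mixed` is stored as a float, which is why the output above shows `1.0` rather than `1`.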

[ ]: # Null values: 1. machine error: the machine was not able to capture this information

# 2. human error: people did not enter the data

[19]: int(2)*2

[19]: 4

[20]: int('2') *2

[20]: 4

[21]: '2'*2 # it repeats the string twice

[21]: '22'

[9]: pd.Series([1,2,3,4,5,'abc']) # str is the highest level, so the Series falls back to dtype object; the integers are stored as Python objects, not converted to strings

[9]: 0 1
1 2
2 3
3 4
4 5
5 abc
dtype: object
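A short check of what `dtype: object` means for the elements themselves. This is a sketch with illustrative names (`s`, `element_types`, `as_text`); it assumes only that pandas is installed.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 'abc'])

# With dtype object, each element keeps its own Python type.
element_types = [type(v).__name__ for v in s]
print(s.dtype, element_types)

# Converting every element to text requires an explicit cast.
as_text = s.astype(str)
print(as_text.iloc[0] * 2)  # string repetition, not arithmetic
```

If string behaviour is wanted for the whole column, the explicit `astype(str)` call is what produces it.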

[11]: 1*2

[11]: 2

[16]: int('1') * 2

[16]: 2

[22]: 'a' *2 # the string is repeated that many times

[22]: 'aa'

[23]: '2' *2

[23]: '22'

[ ]: import os
os.getcwd() # get current working directory

os.chdir('') # supply the path to the new folder here

[ ]: # a function created with a def block can be reused any number of times across the Python code

# if the function is needed only once and its logic is simple, use a lambda instead
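To make the def versus lambda choice concrete, here is a small sketch applying the same Even/Odd rule both ways. The function name `parity` and the sample values are hypothetical.

```python
# Reusable named function: define once, call anywhere.
def parity(x):
    """Return 'Even' when x is exactly divisible by 2, otherwise 'Odd'."""
    return 'Even' if x % 2 == 0 else 'Odd'

values = [12.0, 11.5, 11.0]

labels_def = [parity(v) for v in values]

# One-off lambda: the same logic written inline where it is used.
labels_lambda = list(map(lambda x: 'Even' if x % 2 == 0 else 'Odd', values))

print(labels_def)
print(labels_lambda)
```

Note that with this rule a non-integer value such as 11.5 is labelled 'Odd', because its remainder after division by 2 is not zero.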

[24]: df1['col1'] = df1['acceleration'].apply(lambda x: 'Even' if x%2 == 0 else 'Odd')

[25]: df1.head()

[25]: mpg cylinders displacement Horse Power weight acceleration year \

0 18.0 8.0 307.0 130 3504 12.0 70
1 15.0 8.0 350.0 165 3693 11.5 70
2 NaN 8.0 318.0 150 3436 11.0 70
3 NaN 8.0 NaN 150 3433 12.0 70
4 NaN 8.0 NaN 140 3449 10.5 70

origin name col1

0 1 chevrolet chevelle malibu Even
1 1 buick skylark 320 Odd
2 1 plymouth satellite Odd
3 1 amc rebel sst Even
4 1 ford torino Odd

[ ]: # inplace=True --> changes are committed to df1 in memory, not to your Excel file
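The effect of `inplace=True` can be seen on a small frame. This sketch uses a made-up DataFrame `df` and `sort_values` as the example method; the notebook does not say which method the note refers to.

```python
import pandas as pd

df = pd.DataFrame({'acceleration': [12.0, 8.0, 24.8]})

# Without inplace, a sorted copy is returned and df itself is unchanged.
returned = df.sort_values(by='acceleration')
first_before = df['acceleration'].iloc[0]

# With inplace=True, df itself is reordered in memory.
df.sort_values(by='acceleration', inplace=True)
first_after = df['acceleration'].iloc[0]

print(first_before, first_after)
```

Neither call touches any file on disk. A file changes only when it is written again, for example with `to_excel` or `to_csv`.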

[27]: pwd

[27]: 'C:\\Users\\admin\\2802'

[26]: df1.to_excel('auto_updated.xlsx', index = False)

[ ]: # a string of non-numeric characters cannot be converted to an integer

[17]: int('abc') # raises ValueError; with pd.to_numeric, errors='coerce' returns a null value where conversion is not possible

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[17], line 1
----> 1 int('abc')

ValueError: invalid literal for int() with base 10: 'abc'
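As a contrast to the failing `int('abc')` call, the sketch below shows `pd.to_numeric` with `errors='coerce'` on a small made-up Series. The names `raw` and `converted` are illustrative.

```python
import pandas as pd

raw = pd.Series(['1', '2', 'abc'])

# The built-in int() raises as soon as it meets a non-numeric string.
try:
    int('abc')
    raised = False
except ValueError:
    raised = True

# errors='coerce' replaces values that cannot be parsed with NaN instead of raising.
converted = pd.to_numeric(raw, errors='coerce')

print(raised)
print(converted)
```

Because NaN is a floating-point value, the converted column comes back with a floating dtype even though the parsable entries were whole numbers.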

[18]: help(pd.to_numeric)

Help on function to_numeric in module pandas.core.tools.numeric:

to_numeric(arg, errors: 'DateTimeErrorChoices' = 'raise', downcast:
"Literal['integer', 'signed', 'unsigned', 'float'] | None" = None,
dtype_backend: 'DtypeBackend | lib.NoDefault' = <no_default>)
Convert argument to a numeric type.

The default return dtype is `float64` or `int64`
depending on the data supplied. Use the `downcast` parameter
to obtain other dtypes.

Please note that precision loss may occur if really large numbers
are passed in. Due to the internal limitations of `ndarray`, if
numbers smaller than `-9223372036854775808` (np.iinfo(np.int64).min)
or larger than `18446744073709551615` (np.iinfo(np.uint64).max) are
passed in, it is very likely they will be converted to float so that
they can be stored in an `ndarray`. These warnings apply similarly to
`Series` since it internally leverages `ndarray`.

Parameters
----------
arg : scalar, list, tuple, 1-d array, or Series
Argument to be converted.
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
- If 'raise', then invalid parsing will raise an exception.
- If 'coerce', then invalid parsing will be set as NaN.
- If 'ignore', then invalid parsing will return the input.

.. versionchanged:: 2.2

"ignore" is deprecated. Catch exceptions explicitly instead.

downcast : str, default None

Can be 'integer', 'signed', 'unsigned', or 'float'.
If not None, and if the data has been successfully cast to a
numerical dtype (or if the data was numeric to begin with),
downcast that resulting data to the smallest numerical dtype

possible according to the following rules:

- 'integer' or 'signed': smallest signed int dtype (min.: np.int8)

- 'unsigned': smallest unsigned int dtype (min.: np.uint8)
- 'float': smallest float dtype (min.: np.float32)

As this behaviour is separate from the core conversion to

numeric values, any errors raised during the downcasting
will be surfaced regardless of the value of the 'errors' input.

In addition, downcasting will only occur if the size

of the resulting data's dtype is strictly larger than
the dtype it is to be cast to, so if none of the dtypes
checked satisfy that specification, no downcasting will be
performed on the data.
dtype_backend : {'numpy_nullable', 'pyarrow'}, default 'numpy_nullable'
Back-end data type applied to the resultant :class:`DataFrame`
(still experimental). Behaviour is as follows:

* ``"numpy_nullable"``: returns nullable-dtype-backed :class:`DataFrame`

(default).
* ``"pyarrow"``: returns pyarrow-backed nullable :class:`ArrowDtype`
DataFrame.

.. versionadded:: 2.0

Returns
-------
ret
Numeric if parsing succeeded.
Return type depends on input. Series if Series, otherwise ndarray.

See Also
--------
DataFrame.astype : Cast argument to a specified dtype.
to_datetime : Convert argument to datetime.
to_timedelta : Convert argument to timedelta.
numpy.ndarray.astype : Cast a numpy array to a specified type.
DataFrame.convert_dtypes : Convert dtypes.

Examples
--------
Take separate series and convert to numeric, coercing when told to

>>> s = pd.Series(['1.0', '2', -3])

>>> pd.to_numeric(s)
0 1.0
1 2.0

2 -3.0
dtype: float64
>>> pd.to_numeric(s, downcast='float')
0 1.0
1 2.0
2 -3.0
dtype: float32
>>> pd.to_numeric(s, downcast='signed')
0 1
1 2
2 -3
dtype: int8
>>> s = pd.Series(['apple', '1.0', '2', -3])
>>> pd.to_numeric(s, errors='coerce')
0 NaN
1 1.0
2 2.0
3 -3.0
dtype: float64

Downcasting of nullable integer and floating dtypes is supported:

>>> s = pd.Series([1, 2, 3], dtype="Int64")

>>> pd.to_numeric(s, downcast="integer")
0 1
1 2
2 3
dtype: Int8
>>> s = pd.Series([1.0, 2.1, 3.0], dtype="Float64")
>>> pd.to_numeric(s, downcast="float")
0 1.0
1 2.1
2 3.0
dtype: Float32

[ ]: # executing a pandas command returns the result, which the notebook displays

# if you assign that result to a variable, it is not displayed

[31]: df1['acceleration'].apply(lambda x: 'Even' if x%2 == 0 else 'Odd')

[31]: 0 Even
1 Odd
2 Odd
3 Even
4 Odd
…

392 Odd
393 Odd
394 Odd
395 Odd
396 Odd
Name: acceleration, Length: 397, dtype: object

[32]: df1['col1'] = df1['acceleration'].apply(lambda x: 'Even' if x%2 == 0 else 'Odd')

[33]: df1.head(2)

[33]: mpg cylinders displacement Horse Power weight acceleration year \

0 18.0 8.0 307.0 130 3504 12.0 70
1 15.0 8.0 350.0 165 3693 11.5 70

origin name col1

0 1 chevrolet chevelle malibu Even
1 1 buick skylark 320 Odd

[ ]: ['col1','col2']

[29]: # Sorting the data set

[30]: df1.sort_values(by = 'acceleration', ascending=False)

# ascending = False --> descending order
# ascending = True --> ascending order ( by default )

[30]: mpg cylinders displacement Horse Power weight acceleration year \

299 27.2 4.0 141.0 71 3190 24.8 79
393 44.0 4.0 97.0 52 2130 24.6 82
326 43.4 4.0 90.0 48 2335 23.7 80
59 23.0 4.0 97.0 54 2254 23.5 72
195 29.0 4.0 85.0 52 2035 22.2 76
.. … … … … … … …
12 15.0 NaN 400.0 150 3761 9.5 70
6 NaN 8.0 NaN 220 4354 9.0 70
7 14.0 8.0 NaN 215 4312 8.5 70
9 15.0 8.0 390.0 190 3850 8.5 70
11 14.0 NaN 340.0 160 3609 8.0 70

origin name col1

299 2 peugeot 504 Odd
393 2 vw pickup Odd
326 2 vw dasher (diesel) Odd
59 2 volkswagen type 3 Odd
195 1 chevrolet chevette Odd

.. … … …
12 1 chevrolet monte carlo Odd
6 1 chevrolet impala Odd
7 1 plymouth fury iii Odd
9 1 amc ambassador dpl Odd
11 1 plymouth 'cuda 340 Even

[397 rows x 10 columns]

[ ]: # df1.sort_values(by = ['acceleration','weight'], ascending=[False,True])
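The multi-column sort in the commented line above can be tried on a tiny frame. This is a sketch with invented rows; the frame `cars` and its `name` values are hypothetical and do not come from Auto.csv.

```python
import pandas as pd

cars = pd.DataFrame({
    'name': ['a', 'b', 'c', 'd'],
    'acceleration': [8.5, 24.8, 8.5, 12.0],
    'weight': [4312, 3190, 3850, 3504],
})

# acceleration descending first; ties are broken by weight ascending.
ordered = cars.sort_values(by=['acceleration', 'weight'], ascending=[False, True])
order = ordered['name'].tolist()

print(order)
```

The two rows that share an acceleration of 8.5 end up ordered by weight, lighter first, which is the role of the second entry in each list.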

How to filter the dataset

[34]: df1.shape

[34]: (397, 10)

[ ]: # extract the rows where mpg value is greater than 20

[38]: df1['mpg']

[38]: 0 18.0
1 15.0
2 NaN
3 NaN
4 NaN
…
392 27.0
393 44.0
394 32.0
395 28.0
396 31.0
Name: mpg, Length: 397, dtype: float64

[40]: cond1 = df1['mpg'] > 20

df1[['mpg','weight','acceleration']][cond1] # returns the rows where cond1 is True

[40]: mpg weight acceleration

14 24.0 2372 15.0
15 22.0 2833 15.5
17 21.0 2587 16.0
18 27.0 2130 14.5
19 26.0 1835 20.5
.. … … …
392 27.0 2790 15.6
393 44.0 2130 24.6

394 32.0 2295 11.6
395 28.0 2625 18.6
396 31.0 2720 19.4

[237 rows x 3 columns]
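The same kind of filter can also be written with `.loc`, which selects rows and columns in a single step. This is a sketch on a made-up frame `cars`, offered as an alternative spelling rather than the notebook's own approach.

```python
import pandas as pd

cars = pd.DataFrame({
    'mpg': [18.0, None, 24.0, 44.0],
    'weight': [3504, 3436, 2372, 2130],
    'acceleration': [12.0, 11.0, 15.0, 24.6],
})

# The comparison yields a boolean Series; a missing mpg compares as False.
cond = cars['mpg'] > 20

# Row mask and column list passed together to .loc.
subset = cars.loc[cond, ['mpg', 'weight']]

print(cond.tolist())
print(subset)
```

Rows with a null `mpg` never satisfy `> 20`, so they are excluded from the result without any extra handling.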

[ ]: df1['col1'] = df1['acceleration'].apply(lambda x: 'Even' if x%2 == 0 else 'Odd')

[ ]: # & --> and
# | --> or
# ~ --> negation
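A compact sketch of the three operators on a made-up frame. The frame `cars` and the result names are illustrative. Each comparison is wrapped in parentheses because `&`, `|` and `~` bind more tightly than `>` and `==`.

```python
import pandas as pd

cars = pd.DataFrame({
    'mpg': [18.0, 21.0, 25.0, 15.0],
    'col1': ['Even', 'Even', 'Odd', 'Odd'],
})

# & keeps rows where both conditions hold.
both = cars[(cars['mpg'] > 20) & (cars['col1'] == 'Even')]

# | keeps rows where at least one condition holds.
either = cars[(cars['mpg'] > 20) | (cars['col1'] == 'Even')]

# ~ flips a condition.
not_high = cars[~(cars['mpg'] > 20)]

print(both.index.tolist(), either.index.tolist(), not_high.index.tolist())
```

Python's `and`, `or` and `not` keywords do not work element-wise on a Series, which is why the symbolic operators are used here.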

[41]: cond1 = (df1['mpg'] > 20) & (df1['col1'] == 'Even')

df1[cond1] # it will return the rows where cond1 is set to True

[41]: mpg cylinders displacement Horse Power weight acceleration year \

17 21.0 6.0 200.0 85 2587 16.0 70
31 25.0 4.0 113.0 95 2228 14.0 71
49 23.0 4.0 122.0 86 2220 14.0 71
50 28.0 4.0 116.0 90 2123 14.0 71
54 35.0 4.0 72.0 69 1613 18.0 71
77 22.0 4.0 121.0 76 2511 18.0 72
79 26.0 4.0 96.0 69 2189 18.0 72
80 22.0 4.0 122.0 86 2395 16.0 72
101 23.0 6.0 198.0 95 2904 16.0 73
113 21.0 6.0 155.0 107 2472 14.0 73
122 24.0 4.0 121.0 110 2660 14.0 73
148 26.0 4.0 116.0 75 2246 14.0 74
151 31.0 4.0 79.0 67 2000 16.0 74
167 29.0 4.0 97.0 75 2171 16.0 75
175 29.0 4.0 90.0 70 1937 14.0 75
234 24.5 4.0 151.0 88 2740 16.0 77
293 31.9 4.0 89.0 71 1925 14.0 79
305 28.4 4.0 151.0 90 2670 16.0 79
331 33.8 4.0 97.0 67 2145 18.0 80
349 34.1 4.0 91.0 68 1985 16.0 81
369 34.0 4.0 112.0 88 2395 18.0 82
371 29.0 4.0 135.0 84 2525 16.0 82
372 27.0 4.0 151.0 90 2735 18.0 82

origin name col1

17 1 ford maverick Even
31 3 toyota corona Even
49 1 mercury capri 2000 Even
50 2 opel 1900 Even
54 3 datsun 1200 Even
77 2 volkswagen 411 (sw) Even

79 2 renault 12 (sw) Even
80 1 ford pinto (sw) Even
101 1 plymouth duster Even
113 1 mercury capri v6 Even
122 2 saab 99le Even
148 2 fiat 124 tc Even
151 2 fiat x1.9 Even
167 3 toyota corolla Even
175 2 volkswagen rabbit Even
234 1 pontiac sunbird coupe Even
293 2 vw rabbit custom Even
305 1 buick skylark limited Even
331 3 subaru dl Even
349 3 mazda glc 4 Even
369 1 chevrolet cavalier 2-door Even
371 1 dodge aries se Even
372 1 pontiac phoenix Even

[42]: df1[(df1['mpg'] > 20) & (df1['col1'] == 'Even')] # same filter written inline; returns the rows where the combined condition is True

[42]: mpg cylinders displacement Horse Power weight acceleration year \

origin name col1

17 1 ford maverick Even
31 3 toyota corona Even
49 1 mercury capri 2000 Even
50 2 opel 1900 Even
54 3 datsun 1200 Even
77 2 volkswagen 411 (sw) Even
79 2 renault 12 (sw) Even
80 1 ford pinto (sw) Even
101 1 plymouth duster Even
113 1 mercury capri v6 Even
122 2 saab 99le Even
148 2 fiat 124 tc Even
151 2 fiat x1.9 Even
167 3 toyota corolla Even
175 2 volkswagen rabbit Even
234 1 pontiac sunbird coupe Even
293 2 vw rabbit custom Even
305 1 buick skylark limited Even
331 3 subaru dl Even
349 3 mazda glc 4 Even
369 1 chevrolet cavalier 2-door Even
371 1 dodge aries se Even
372 1 pontiac phoenix Even

[43]: df1[pd.isna(df1['mpg'])]

[43]: mpg cylinders displacement Horse Power weight acceleration year \

2 NaN 8.0 318.0 150 3436 11.0 70
3 NaN 8.0 NaN 150 3433 12.0 70
4 NaN 8.0 NaN 140 3449 10.5 70
5 NaN 8.0 NaN 198 4341 10.0 70
6 NaN 8.0 NaN 220 4354 9.0 70

origin name col1

2 1 plymouth satellite Odd
3 1 amc rebel sst Even
4 1 ford torino Odd
5 1 ford galaxie 500 Even
6 1 chevrolet impala Odd

[44]: df1[~(pd.isna(df1['mpg']))] # negation flips the mask, so this returns the rows where mpg is not null

[44]: mpg cylinders displacement Horse Power weight acceleration year \

0 18.0 8.0 307.0 130 3504 12.0 70
1 15.0 8.0 350.0 165 3693 11.5 70
7 14.0 8.0 NaN 215 4312 8.5 70

8 14.0 8.0 455.0 225 4425 10.0 70
9 15.0 8.0 390.0 190 3850 8.5 70
.. … … … … … … …
392 27.0 4.0 140.0 86 2790 15.6 82
393 44.0 4.0 97.0 52 2130 24.6 82
394 32.0 4.0 135.0 84 2295 11.6 82
395 28.0 4.0 120.0 79 2625 18.6 82
396 31.0 4.0 119.0 82 2720 19.4 82

origin name col1

0 1 chevrolet chevelle malibu Even
1 1 buick skylark 320 Odd
7 1 plymouth fury iii Odd
8 1 pontiac catalina Even
9 1 amc ambassador dpl Odd
.. … … …
392 1 ford mustang gl Odd
393 2 vw pickup Odd
394 1 dodge rampage Odd
395 1 ford ranger Odd
396 1 chevy s-10 Odd

[392 rows x 10 columns]
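The null and not-null filters above have a few equivalent spellings. The sketch below compares them on a made-up frame `cars`; the names are illustrative.

```python
import pandas as pd

cars = pd.DataFrame({'mpg': [18.0, None, 15.0], 'name': ['a', 'b', 'c']})

# Rows where mpg is missing.
missing = cars[cars['mpg'].isna()]

# Rows where mpg is present: notna() is the direct counterpart of ~isna().
present = cars[cars['mpg'].notna()]
present_negated = cars[~pd.isna(cars['mpg'])]

# dropna with subset removes rows that are null in the listed columns.
dropped = cars.dropna(subset=['mpg'])

print(missing.index.tolist(), present.index.tolist())
```

All three ways of keeping the non-null rows give the same result here, so the choice between them is mostly one of readability.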
