AI VIET NAM - COURSE 2024
aivietnam.edu.vn

Data Analysis - Exercise

August 17, 2024

Part I: Theory

Pandas is a Python library for analyzing and manipulating data. It is fast, powerful, flexible, easy to use, and open source. Pandas is built on top of NumPy and provides many functions for cleaning, analyzing, and manipulating data, which helps extract valuable insights from datasets. It is especially effective on tabular data such as SQL tables or Excel spreadsheets.

Figure 1: The Pandas logo

Some features of Pandas:
* Works with data sources such as CSV files, Excel files, SQL, and JSON files.
* Provides several data structures: Series, DataFrame, and Panel.
* Handles many kinds of datasets: time series, heterogeneous data, tabular data, and matrix data.
* Handles missing data by dropping it, or by replacing it with zeros or another value suited to the case at hand.
* Can be used for parsing and converting data.
* Provides data filtering techniques.
* Provides time series functionality: date range generation, frequency conversion, moving window statistics, data shifting, and lagging.
* Integrates well with other Python libraries such as scikit-learn, statsmodels, and SciPy.
* Has high performance.

Data structures in Pandas: Pandas is built on NumPy arrays and includes Series, DataFrame, and Panel:

* Series: A 1D array of homogeneous data; the data type can be integer, string, float, and so on. The labeled axis is called the index. The size of a Series is immutable, while its values are mutable.
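To make the Series description concrete, here is a minimal sketch with made-up values. The labels 'a', 'b', 'c' and the variable names are illustrative assumptions, not part of the handout. It shows the labeled index, the single dtype shared by all values, and an in-place value change:

```python
import pandas as pd

# Hypothetical example data: three integers labeled 'a', 'b', 'c'
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

labels = list(s.index)   # the labeled axis (index)
kind = str(s.dtype)      # one dtype for all values (homogeneous)

s["b"] = 25              # values are mutable: overwrite by label
length_after = len(s)    # overwriting a value does not change the size

print(labels, kind, length_after)
```

Access by label (`s["b"]`) is what distinguishes a Series from a plain NumPy array, which only supports positional access.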
To create a Series, use pandas.Series(data, index, dtype, copy), where:
  - data: accepts an ndarray, list, dictionary, constant, and so on.
  - index: index labels must be hashable and have the same length as data, and are normally unique; the default index is 0, 1, 2, ...
  - dtype: the data type of the values in the Series.

Figure 2: Example of a Series in Pandas (an index column next to a data column)

* DataFrame: A 2D, table-like data structure made of columns and rows. Columns can have different data types, such as float64, int, bool, and so on. Each column of a DataFrame is a Series. Both axes of a DataFrame are labeled (rows and columns), so we can operate on rows as well as on columns. To create a DataFrame, use pandas.DataFrame(data, index, columns, dtype, copy), where:
  - data: accepts an ndarray, Series, map, list, dict, constant, or another DataFrame.
  - The other parameters are analogous to those of Series.
  A pandas DataFrame can therefore be built from inputs such as lists, dicts, Series, NumPy ndarrays, or another DataFrame.

Figure 3: Example of a DataFrame in Pandas. A DataFrame can be viewed as a collection of Series.

* Panel: A 3D container (deprecated and then removed from pandas as of version 0.25), where:
  - items: axis 0; each item corresponds to a DataFrame contained inside.
  - major_axis: axis 1; the rows of each DataFrame.
  - minor_axis: axis 2; the columns of each DataFrame.

Figure 4: Example of a Panel in Pandas. A Panel can be viewed as a collection of DataFrames.

Some Pandas functions commonly used for data processing:
* Handling missing values: isna(), isnull(), notna() to find NA values.
* Indexing and slicing: .loc (label based), .iloc (integer based), .ix (label and integer based; removed in pandas 1.0).
* Queries similar to Excel or SQL: where(), query()
* Sorting: sort_index(), sort_values()
* Series basic functionality: axes, dtype, empty, ndim, size, values, head(), tail()
* DataFrame basic functionality: T, axes, dtypes, empty, ndim, shape, size, values, head(), tail()
* Statistics-related functions: count(), sum(), mean(), median(), mode(), std(), min(), max(), abs(), prod(), cumsum(), cumprod(), describe(), pct_change(), cov(), corr(), rank(), var(), skew(), apply(), ...
* Data filtering functions: groupby(), get_group(), merge(), concat(), append() (removed in pandas 2.0; use concat() instead), melt(), pivot(), pivot_table()
* Other functions: get_option(), set_option(), reset_option(), describe_option(), option_context()

Part II: Exercises

In this part, we use pandas to carry out several analysis techniques on two datasets, one text-based and one time series. The exercises are organized as the steps taken to solve each problem.

A. Data Analysis with IMDB Movie data

The IMDB Movie dataset is a movie rating dataset, used to analyze interest in movies along criteria such as actors and title, in order to gain insight and make predictions. Download the dataset IMDB-Movie-Data.csv here.

Steps to carry out:
1. Read data
2. View the data
3. Understand some basic information about the data
4. Data Selection - Indexing and Slicing data
5. Data Selection - Based on Conditional filtering
6. Groupby operations
7. Sorting operation
8. View missing values
9. Deal with missing values - Deleting
10. Deal with missing values - Filling
11. Apply() functions

We now carry out each step and comment on it. The code was run on Google Colab:
1. Import libraries and load the dataset: To read a CSV file in pandas, use the read_csv function as follows:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    dataset_path = 'IMDB-Movie-Data.csv'

    # Read data from .csv file
    data = pd.read_csv(dataset_path)

Alternatively, we can specify which column serves as the index while reading (by default, pandas creates its own integer index). Here, we can choose the Title column as the index as follows (index labels should not contain duplicates):

    # Read data with an explicit index column.
    # We will use this later in our analysis
    data_indexed = pd.read_csv(dataset_path, index_col="Title")

2. View the data: Look at the first 5 rows of the table using head():

    # Preview top 5 rows using head()
    data.head()

Figure 5: The first few samples of the dataset

3. Understand some basic information about the data:

    # Let's first understand the basic information about this data
    data.info()

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 1000 entries, 0 to 999
    Data columns (total 12 columns):
     #   Column              Non-Null Count  Dtype
     0   Rank                1000 non-null   int64
     1   Title               1000 non-null   object
     2   Genre               1000 non-null   object
     3   Description         1000 non-null   object
     4   Director            1000 non-null   object
     5   Actors              1000 non-null   object
     6   Year                1000 non-null   int64
     7   Runtime (Minutes)   1000 non-null   int64
     8   Rating              1000 non-null   float64
     9   Votes               1000 non-null   int64
     10  Revenue (Millions)  872 non-null    float64
     11  Metascore           936 non-null    float64
    dtypes: float64(3), int64(4), object(5)
    memory usage: 93.9+ KB

Figure 6: Basic information about the data table

    data.describe()

Figure 7: Statistical overview of the dataset (count, mean, std, min, quartiles, and max for the numeric columns Rank, Year, Runtime (Minutes), Rating, Votes, Revenue (Millions), and Metascore)

From this summary we can see:
* The min and max of Year show that the dataset contains movies from 2006 to 2016.
* The average rating is 6.7, the lowest is 1.9, and the highest is 9.0.
* The highest revenue is 936.6 million dollars.

4. Data Selection - Indexing and Slicing data: From the table, we can extract any column as either a Series or a DataFrame, depending on how we extract it. Here, we extract some columns from data using indexing.
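The difference between the two extraction styles can be checked on a tiny hand-made table before trying it on the IMDB data. The titles and genres below are invented for illustration. As a sketch: single brackets return a Series, while double brackets (a list of column names) return a DataFrame, even for one column.

```python
import pandas as pd

# Invented toy table, not the IMDB data
toy = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Genre": ["Drama", "Comedy", "Horror"],
})

as_series = toy["Genre"]     # single brackets -> pandas.Series
as_frame = toy[["Genre"]]    # double brackets -> pandas.DataFrame

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

The shapes differ as well: the Series is one-dimensional, while the DataFrame keeps a second axis for its single column.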
To extract a column as a Series:

    # Extract data as series
    genre = data['Genre']
    genre

    0       Action,Adventure,Sci-Fi
    1      Adventure,Mystery,Sci-Fi
    2               Horror,Thriller
    3       Animation,Comedy,Family
    4      Action,Adventure,Fantasy
                     ...
    995         Crime,Drama,Mystery
    996                      Horror
    997         Drama,Music,Romance
    998            Adventure,Comedy
    999       Comedy,Family,Fantasy
    Name: Genre, Length: 1000, dtype: object

Figure 8: Extracting the Genre column as a Series

To extract a column as a DataFrame:

    # Extract data as dataframe
    data[['Genre']]

Figure 9: Extracting the Genre column as a DataFrame (1000 rows x 1 columns)

We can also select several columns at once, producing a new DataFrame:

    some_cols = data[['Title', 'Genre', 'Actors', 'Director', 'Rating']]

For rows, we can extract a fixed range of rows, from index X to index Y in the table; this is called slicing. For example, to extract rows 10 through 14 (the end position 15 is exclusive):

    data.iloc[10:15][['Title', 'Rating', 'Revenue (Millions)']]

Combined with column selection, this gives a table of 5 samples with the fields Title, Rating, and Revenue (Millions).

Figure 10: Extracting some rows and columns into a new DataFrame

5. Data Selection - Based on Conditional filtering: We can also select rows based on conditions of our own choosing. For example, suppose we want the movies released from 2010 to 2015 that have a rating below 6.0 but a revenue in the top 5% of the whole dataset.
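Before writing such a filter as a single expression, it can help to build it in pieces. The sketch below uses a small invented table (not the IMDB data, and with a column simply named `Revenue`), and because the table is tiny it uses the 0.5 quantile instead of 0.95. Each comparison produces a boolean Series, and `&` combines them row by row:

```python
import pandas as pd

# Invented toy table, not the IMDB data
toy = pd.DataFrame({
    "Year":    [2009, 2011, 2012, 2014, 2016],
    "Rating":  [5.0, 5.5, 7.2, 4.8, 5.9],
    "Revenue": [300.0, 450.0, 400.0, 50.0, 500.0],
})

in_period = (toy["Year"] >= 2010) & (toy["Year"] <= 2015)
low_rating = toy["Rating"] < 6.0
threshold = toy["Revenue"].quantile(0.5)   # median revenue of the toy table
high_revenue = toy["Revenue"] > threshold

# Keep only the rows where all three conditions hold
selected = toy[in_period & low_rating & high_revenue]
print(selected)
```

When the conditions are written inline, each comparison needs its own parentheses, because `&` binds more tightly than `>=` or `<` in Python.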
Accordingly, the code can be written as follows:

    data[((data['Year'] >= 2010) & (data['Year'] <= 2015))
         & (data['Rating'] < 6.0)
         & (data['Revenue (Millions)'] > data['Revenue (Millions)'].quantile(0.95))]

Figure 11: High-revenue movies in the period 2010-2015

6. Groupby Operations: Groupby groups data based on one or more variables (here, columns of the table). For example, we can find the average rating each director achieved by grouping the Rating values of the movies by Director:

    data.groupby('Director')[['Rating']].mean().head()

Figure 12: Using groupby to find the average rating of the directors in the dataset

7. Sorting Operations: Sorting arranges the rows of the table in ascending or descending order of the values in some column. For example, building on the previous groupby result, we can find the top 5 directors with the highest average rating as follows:

    data.groupby('Director')[['Rating']].mean().sort_values(['Rating'], ascending=False).head()

                       Rating
    Director
    Nitesh Tiwari        8.80
    Christopher Nolan    8.68
    Olivier Nakache      8.60
    Makoto Shinkai       8.60
    Aamir Khan           8.50

Figure 13: The 5 directors with the highest average rating

8. View missing values: Datasets often contain empty (missing) values in some fields of some samples, and this has to be handled when processing the data.
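What "missing" looks like in practice can be seen on a toy table. This is a sketch with invented values, where `np.nan` marks the missing entries. `isna()` returns a table of booleans, and summing it counts the missing values in each column:

```python
import numpy as np
import pandas as pd

# Invented toy table with two missing revenue entries
toy = pd.DataFrame({
    "Rating":  [7.1, 6.4, 8.0],
    "Revenue": [120.5, np.nan, np.nan],
})

missing_mask = toy.isna()              # True where a value is missing
missing_per_column = toy.isna().sum()  # number of missing values per column
print(missing_per_column)
```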
So the first step is to check where data is missing, as follows:

    # Count the null values in each column
    data.isnull().sum()

Figure 14: Number of null values in each column of the table

Here we see that Revenue (Millions) and Metascore are the two columns that contain null values (128 and 64 respectively, consistent with the non-null counts of 872 and 936 reported by info()). There are two main ways to handle missing data: fill the gaps with some value, or remove them.

9. Deal with missing values - Deleting: With the removal approach, we can drop an entire column that contains many null values (when that is acceptable), or drop only the rows that contain null values. To drop a column:

    # Use drop function to drop columns
    data.drop('Metascore', axis=1).head()

To drop rows, use dropna().

10. Deal with missing values - Filling: With the filling approach, we can replace the empty cells with the mean, median, and so on of the corresponding column (which value to use depends on the nature of the dataset and the problem being solved). For example, some rows have a null Revenue, and we can assign them the mean value as follows:

    revenue_mean = data_indexed['Revenue (Millions)'].mean()
    print("The mean revenue is: ", revenue_mean)

    # We can fill the null values with this mean revenue
    data_indexed['Revenue (Millions)'] = data_indexed['Revenue (Millions)'].fillna(revenue_mean)

11. apply() functions: apply() is used when we want to run a function over the rows of the table. After it runs, the value returned by the function becomes the new value for the corresponding row. For example, to classify movies into three levels ['Good', 'Average', 'Bad'] based on Rating, we can define a function that does this and apply it to the DataFrame:

    # Classify movies based on ratings
    def rating_group(rating):
        if rating >= 7.5:
            return 'Good'
        elif rating >= 6.0:
            return 'Average'
        else:
            return 'Bad'

    # Let's apply this function on our movies data,
    # creating a new variable in the dataset to hold the rating category
    data['Rating_category'] = data['Rating'].apply(rating_group)

    data[['Title', 'Director', 'Rating', 'Rating_category']].head(5)

Figure 15: The DataFrame after applying rating_group(). The values returned by the function are stored in a new column named Rating_category.

B. Data Analysis with Time Series data

Time series data is data whose values are measured at different points in time. Some time series are spaced at a regular frequency, for example hourly weather, daily website traffic, or monthly total revenue. Time series can also be irregularly spaced, for example the emergency calls received during a day or system logs.

Figure 16: Illustration of a time series plot (variable against time)

In this part, we explore how to organize and visualize time series data. Specifically, using an energy time series, we get familiar with and apply time-based indexing, resampling, and rolling windows.
These techniques let us analyze important aspects of the data. For example, rolling windows can help us explore variations in electricity demand and renewable energy supply over time. We use the daily time series from Open Power System Data (OPSD) for Germany, which contains the country-wide totals of electricity consumption, wind power production, and solar power production for 2006-2017. Download the dataset opsd_germany_daily.csv here.

We will cover the following topics:
* Import libraries and read dataset
* Time-based indexing
* Visualizing time series data
* Seasonality
* Frequencies
* Resampling
* Rolling windows
* Trends

We will explore how electricity consumption and production in Germany have changed over time, and answer these questions:
* When is electricity consumption typically highest and lowest?
* How do wind and solar power production vary with the seasons?
* What are the long-term trends in electricity consumption, solar power, and wind power?
* How do wind and solar power production compare with electricity consumption, and how has this ratio changed over time?

1. Import libraries and read dataset: First, we again use read_csv() to read the data table:

    import pandas as pd

    dataset_path = './opsd_germany_daily.csv'

    # Read data from .csv file
    opsd_daily = pd.read_csv(dataset_path)

    print(opsd_daily.shape)
    print(opsd_daily.dtypes)
    opsd_daily.head(3)

The result is shown below; note that many values are missing in the Wind, Solar, and Wind+Solar columns:

    (4383, 5)
    Date            object
    Consumption    float64
    Wind           float64
    Solar          float64
    Wind+Solar     float64
    dtype: object

             Date  Consumption  Wind  Solar  Wind+Solar
    0  2006-01-01     1069.184   NaN    NaN         NaN
    1  2006-01-02     1380.521   NaN    NaN         NaN
    2  2006-01-03     1442.533   NaN    NaN         NaN

Figure 17: The first few rows of the DataFrame

For time series data, we can use the Date column as the index (its values in this dataset are always unique):

    opsd_daily = opsd_daily.set_index('Date')
    opsd_daily.head(3)

Figure 18: The data table after setting the Date column as the index

We can also repeat the load step and specify the index column directly in the function call, and at the same time create the additional columns Year, Month, and Weekday Name extracted from the dates, which is convenient for later steps:

    opsd_daily = pd.read_csv('opsd_germany_daily.csv', index_col=0, parse_dates=True)

    # Add columns with year, month, and weekday name
    opsd_daily['Year'] = opsd_daily.index.year
    opsd_daily['Month'] = opsd_daily.index.month
    opsd_daily['Weekday Name'] = opsd_daily.index.day_name()

    # Display a random sampling of 5 rows
    opsd_daily.sample(5, random_state=0)

Figure 19: The DataFrame with the new columns Year, Month, and Weekday Name
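The effect of index_col and parse_dates can be tried without the OPSD file. The sketch below reads a three-row CSV from an in-memory string (the values are invented for illustration) and derives the same helper columns from the resulting DatetimeIndex:

```python
import io
import pandas as pd

# Invented three-row CSV standing in for the real file
csv_text = """Date,Consumption
2020-01-01,100.0
2020-01-02,110.0
2020-01-03,105.0
"""

# index_col=0 makes the first column the index; parse_dates=True parses it as dates
toy = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

toy["Year"] = toy.index.year
toy["Month"] = toy.index.month
toy["Weekday Name"] = toy.index.day_name()
print(toy)
```

Without parse_dates=True the index would hold plain strings, and attributes such as `.year` or `.day_name()` would not be available on it.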
2. Time-based indexing: One of the standout features of pandas for time series is time-based indexing, which uses dates and times to organize and access data (much like the indexing in the previous part, except that the labels are now dates). It works through the loc accessor. For example, we can access the data for the period from 2014-01-20 to 2014-01-22:

    opsd_daily.loc['2014-01-20':'2014-01-22']

Figure 20: Selecting the samples from 20/1/2014 to 22/1/2014

Another pandas feature is partial-string indexing, which lets us describe a period loosely instead of giving a full year, month, and day as above. For example:

    opsd_daily.loc['2012-02']

Figure 21: Partial-string indexing. By specifying only '2012-02', we get all the samples that belong to February 2012.

3. Visualizing time series data: Since pandas supports plotting, together with the seaborn library we can easily visualize time series data. For example, we plot the Consumption column as follows:

    import matplotlib.pyplot as plt
    # Display figures inline in Jupyter notebook
    import seaborn as sns

    # Use seaborn style defaults and set the default figure size
    sns.set(rc={'figure.figsize': (11, 4)})
    opsd_daily['Consumption'].plot(linewidth=0.5);

Figure 22: Daily electricity consumption in Germany (Consumption against Date, 2006-2017)

We can also plot several columns at once, each in its own subplot:

    cols_plot = ['Consumption', 'Solar', 'Wind']
    axes = opsd_daily[cols_plot].plot(marker='.', alpha=0.5, linestyle='None',
                                      figsize=(11, 9), subplots=True)
    for ax in axes:
        ax.set_ylabel('Daily Totals (GWh)')
    plt.show()

Figure 23: Electricity consumption, solar power production, and wind power production

4. Seasonality: Seasonality refers to patterns that repeat over a fixed period of time across the years. Such patterns are usually influenced by many different factors. In this exercise, we can explore the seasonality of the data by grouping it by month and drawing box plots with seaborn:

    fig, axes = plt.subplots(3, 1, figsize=(11, 10), sharex=True)
    for name, ax in zip(['Consumption', 'Solar', 'Wind'], axes):
        sns.boxplot(data=opsd_daily, x='Month', y=name, ax=ax)
        ax.set_ylabel('GWh')
        ax.set_title(name)
        # Remove the automatic x-axis label from all but the bottom subplot
        if ax is not axes[-1]:
            ax.set_xlabel('')

Figure 24: Distribution of the Consumption, Solar, and Wind columns by Month

5. Frequencies: With a pandas DatetimeIndex, we can use existing time values to generate a sequence of values at a given frequency. For example, from the two values '1998-03-10' and '1998-03-15' we can create a list of dates at daily frequency, so that the new list becomes '1998-03-10', '1998-03-11', '1998-03-12', '1998-03-13', '1998-03-14', '1998-03-15'. This is done by setting the freq parameter:

    pd.date_range('1998-03-10', '1998-03-15', freq='D')

    DatetimeIndex(['1998-03-10', '1998-03-11', '1998-03-12', '1998-03-13',
                   '1998-03-14', '1998-03-15'],
                  dtype='datetime64[ns]', freq='D')

Figure 25: Example of a daily frequency from 10/3/1998 to 15/3/1998

With this feature, we can fill in missing data using the forward fill (ffill) technique. Forward fill takes the value recorded at the previous time point and uses it as the replacement for every missing value that follows, until a sample with a value is reached. For example, suppose we know the Consumption values for a few days:

    # To select an arbitrary sequence of date/time values from a pandas time series,
    # we need to use a DatetimeIndex, rather than simply a list of date/time strings
    times_sample = pd.to_datetime(['2013-02-03', '2013-02-06', '2013-02-08'])
    # Select the specified dates and just the Consumption column
    consum_sample = opsd_daily.loc[times_sample, ['Consumption']].copy()
    consum_sample

Figure 26: Taking 3 days from the original dataset as an illustration

    # Convert the data to daily frequency, without filling any missings
    consum_freq = consum_sample.asfreq('D')
    # Create a column with missings forward filled
    consum_freq['Consumption - Forward Fill'] = consum_sample.asfreq('D', method='ffill')
    consum_freq

                Consumption  Consumption - Forward Fill
    2013-02-03     1109.639                    1109.639
    2013-02-04          NaN                    1109.639
    2013-02-05          NaN                    1109.639
    2013-02-06     1451.449                    1451.449
    2013-02-07          NaN                    1451.449
    2013-02-08     1433.098                    1433.098

Figure 27: Filling in the other days in the range from 3/2/2013 to 8/2/2013

Given the consumption values of the 3 days above, forward fill lets us fill in values for the remaining days in this range.

6. Resampling: A technique for changing the sampling frequency of a time series dataset; the frequency can be increased or decreased.
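Downsampling can be sketched on synthetic data before applying it to the OPSD table. The 14 consecutive daily values below are invented for illustration. The daily rows are grouped into weekly bins, and each bin is replaced by its mean:

```python
import pandas as pd

# 14 invented daily values, starting on Monday 2024-01-01
days = pd.date_range("2024-01-01", periods=14, freq="D")
daily = pd.Series(range(1, 15), index=days, dtype="float64")

# 'W' bins end on Sunday by default; each output row is the mean of one week
weekly_mean = daily.resample("W").mean()
print(weekly_mean)
```

The result has one row per week, labeled with the Sunday that closes the bin, so 14 daily rows become 2 weekly rows.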
For example, we can reduce the frequency of the current dataset from daily to monthly, which means the new dataset has fewer samples than the original. Resampling is useful for converting a time series to a lower or a higher frequency. Resampling to a lower frequency (downsampling) usually involves aggregation, for example computing monthly revenue from daily data. Resampling to a higher frequency (upsampling) is less common and is typically used for interpolation. Here, we apply downsampling to the current dataset as follows:

    # Specify the data columns we want to include (i.e. exclude Year, Month, Weekday Name)
    data_columns = ['Consumption', 'Wind', 'Solar', 'Wind+Solar']
    # Resample to weekly frequency, aggregating with mean
    opsd_weekly_mean = opsd_daily[data_columns].resample('W').mean()
    opsd_weekly_mean.head(3)

In the code above, we downsample from daily to weekly frequency. Each column value is now the mean of the 7 days in that week.

                Consumption  Wind  Solar  Wind+Solar
    Date
    2006-01-01  1069.184000   NaN    NaN         NaN
    2006-01-08  1381.300143   NaN    NaN         NaN
    2006-01-15  1486.730286   NaN    NaN         NaN

Figure 28: Using resampling to change the sampling frequency of the dataset from daily to weekly

Naturally, after downsampling the new table has fewer samples than the original table, about 1/7 as many. We can check this with the shape attribute of the DataFrame:

    print(opsd_daily.shape[0])
    print(opsd_weekly_mean.shape[0])

We visualize the daily and weekly Solar time series over 6 months as follows:

    # Start and end of the date range to extract
    start, end = '2017-01', '2017-06'
    # Plot daily and weekly resampled time series together
    fig, ax = plt.subplots()
    ax.plot(opsd_daily.loc[start:end, 'Solar'],
            marker='.', linestyle='-', linewidth=0.5, label='Daily')
    ax.plot(opsd_weekly_mean.loc[start:end, 'Solar'],
            marker='o', markersize=8, linestyle='-', label='Weekly Mean Resample')
    ax.set_ylabel('Solar Production (GWh)')
    ax.legend()
    plt.show()
Figure 29: Time series plot of Solar, daily and weekly

Note that the original table contains some null values. To make sure the aggregated values are based on enough data, we pass the min_count parameter. For example, when resampling the dataset to yearly frequency, we set min_count=360 so that a year only gets a value if nearly all of its days are non-null (you may choose a different min_count depending on what you observe in the data):

    # Compute the annual sums, setting the value to NaN for any year which has
    # fewer than 360 days of data
    # ('YE' requires pandas 2.2 or later; older versions use 'A')
    opsd_annual = opsd_daily[data_columns].resample('YE').sum(min_count=360)
    # The default index of the resampled DataFrame is the last day of each year,
    # ('2006-12-31', '2007-12-31', etc.) so to make life easier, set the index
    # to the year component
    opsd_annual = opsd_annual.set_index(opsd_annual.index.year)
    opsd_annual.index.name = 'Year'
    # Compute the ratio of Wind+Solar to Consumption
    opsd_annual['Wind+Solar/Consumption'] = opsd_annual['Wind+Solar'] / opsd_annual['Consumption']
    opsd_annual.tail(3)

           Consumption        Wind      Solar  Wind+Solar  Wind+Solar/Consumption
    Year
    2015  505264.56300   77468.994  34907.138  112376.132                0.222410
    2016  505927.35400   77008.126  34562.824  111570.950                0.220528
    2017  504736.36939  102667.365  35882.643  138550.008                0.274500

Figure 30: Annual resampling of the current dataset

We can draw a chart showing the share of electricity consumption covered by wind and solar production since 2012 as follows:

    # Plot from 2012 onwards, because there is no solar production data in earlier years
    ax = opsd_annual.loc[2012:, 'Wind+Solar/Consumption'].plot.bar(color='C0')
    ax.set_ylabel('Fraction')
    ax.set_ylim(0, 0.3)
    ax.set_title('Wind + Solar Share of Annual Electricity Consumption')
    plt.xticks(rotation=0)

Figure 31: Bar chart of the Wind + Solar share of annual electricity consumption (Fraction against Year, 2012-2017)

7. Rolling windows: A rolling window is another important transformation for time series data. Like downsampling, rolling windows split the data into time windows (periods such as a week or a month, slid across the daily samples), and the data in each window is aggregated with a function such as mean(), median(), or sum(). Unlike downsampling, where the windows do not overlap and the output has a lower frequency than the input, rolling windows overlap and advance at the same frequency as the data, so the transformed time series has the same frequency as the original. As an example, we compute a 7-day rolling mean:

    # Compute the centered 7-day rolling mean
    opsd_7d = opsd_daily[data_columns].rolling(7, center=True).mean()
    opsd_7d.head(10)

                Consumption  Wind  Solar  Wind+Solar
    Date
    2006-01-01          NaN   NaN    NaN         NaN
    2006-01-02          NaN   NaN    NaN         NaN
    2006-01-03          NaN   NaN    NaN         NaN
    2006-01-04  1361.471429   NaN    NaN         NaN
    2006-01-05  1381.300143   NaN    NaN         NaN
    2006-01-06  1402.557571   NaN    NaN         NaN
    2006-01-07  1421.754429   NaN    NaN         NaN
    2006-01-08  1438.891429   NaN    NaN         NaN

Figure 32: Rolling windows with a 7-day period

Here, the window from 2006-01-01 to 2006-01-07 is labeled 2006-01-04, the window from 2006-01-02 to 2006-01-08 is labeled 2006-01-05, and so on for the other rows.

8. Trends: A trend is the long-term tendency of the data, which may increase or decrease over a long period of time. With rolling windows, we can easily visualize the trends in a dataset at different time scales.
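The idea that a rolling mean exposes a trend can be sketched on synthetic data. The values below are invented for illustration: the series alternates between low and high values while drifting upward. A window that spans one full oscillation (here 2 points) averages the alternation away and leaves only the drift:

```python
import pandas as pd

# Invented series: alternates low/high while rising by 1 per day on average
days = pd.date_range("2024-01-01", periods=8, freq="D")
daily = pd.Series([0.0, 10.0, 2.0, 12.0, 4.0, 14.0, 6.0, 16.0], index=days)

# A 2-point rolling mean covers one low and one high value at every step
trend = daily.rolling(2).mean()
print(trend)
```

The raw series goes up and down, while the rolling mean rises steadily. A 365-day window plays the same role for the yearly cycle in the energy data.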
For example, we compute the 365-day rolling mean:

    import matplotlib.dates as mdates

    # The min_periods=360 argument accounts for a few isolated missing days in the
    # wind and solar production time series
    opsd_365d = opsd_daily[data_columns].rolling(window=365, center=True, min_periods=360).mean()

    # Plot daily, 7-day rolling mean, and 365-day rolling mean time series
    fig, ax = plt.subplots()
    ax.plot(opsd_daily['Consumption'], marker='.', markersize=2, color='0.6',
            linestyle='None', label='Daily')
    ax.plot(opsd_7d['Consumption'], linewidth=2, label='7-d Rolling Mean')
    ax.plot(opsd_365d['Consumption'], color='0.2', linewidth=3,
            label='Trend (365-d Rolling Mean)')
    # Set x-ticks to yearly interval and add legend and labels
    ax.xaxis.set_major_locator(mdates.YearLocator())
    ax.legend()
    ax.set_xlabel('Year')
    ax.set_ylabel('Consumption (GWh)')
    ax.set_title('Trends in Electricity Consumption')
    plt.show()

Figure 33: Trends in electricity consumption at weekly and yearly scales; consumption rises sharply at the end of the year

    # Plot 365-day rolling mean time series of wind and solar power
    fig, ax = plt.subplots()
    for nm in ['Wind', 'Solar', 'Wind+Solar']:
        ax.plot(opsd_365d[nm], label=nm)
    # Set x-ticks to yearly interval, adjust y-axis limits, add legend and labels
    ax.xaxis.set_major_locator(mdates.YearLocator())
    ax.set_ylim(0, 400)
    ax.legend()
    ax.set_ylabel('Production (GWh)')
    ax.set_title('Trends in Electricity Production (365-d Rolling Means)')

Figure 34: Trends in wind and solar electricity production, which increase year over year, especially wind power

With the steps above, we have seen how to organize, analyze, and visualize time series data in pandas, using techniques such as time-based indexing, resampling, and rolling windows.
Applying these techniques to the OPSD dataset yields detailed insights into the timing, periodicity, and trends of electricity production and consumption.

Part III: Questions

A. Multiple choice

1. What is Data Analysis?
(a) The process of collecting data
(b) The process of searching for and mining data
(c) The process of processing data
(d) All of the above

2. Which of the following data structures does not belong to pandas?
(a) Series
(b) DataFrame
(c) Panel
(d) Tensor

3. What does the head() method do on a pandas DataFrame?
(a) Displays the last rows
(b) Displays the first rows
(c) Displays a random sample of rows
(d) Displays all rows

4. What does the describe() method return for a pandas DataFrame?
(a) Summary statistics of the string columns
(b) Summary statistics of the numeric columns
(c) Summary statistics of the list columns
(d) Summary statistics of the dict columns

5. Which of the following methods is used to read a CSV file from storage in pandas?
(a) pd.load_csv()
(b) pd.read_csv()
(c) pd.read_file()
(d) pd.load_file()

6. What does the groupby() method do on a pandas DataFrame?
(a) Filters rows by a condition
(b) Aggregates statistics over the columns
(c) Joins DataFrames
(d) Groups the data by the values of one or more columns

7. Which of the following methods is used to check for NaN values in a DataFrame?
(a) df.isna()
(b) df.notna()
(c) df.notnull()
(d) df.tail()

8. Which of the following methods can be used to drop rows that contain null values from a DataFrame?
(a) df.drop_null()
(b) df.drop_missing()
(c) df.dropna()
(d) df.remove_null()

9. Which of the following pandas calls fills missing values in a DataFrame using forward filling?
(a) fillna(method='bfill')
(b) fillna(method='pad')
(c) fillna(method='fill')
(d) fillna(method='forward')

10. Which of the following pandas methods is used to resample data?
(a) resample()
(b) downsample()
(c) reduce()
(d) shrink()

11. Which of the following pandas methods is used to compute rolling windows?
(a) rolling()
(b) mean()
(c) average()
(d) smooth()

12. Which of the following pandas methods is used to apply an arbitrary function to every element of a Series?
(a) pd.Series.transform()
(b) pd.Series.applymap()
(c) pd.Series.map()
(d) pd.Series.apply()

13. Which of the following can be used to retrieve an entire column from a DataFrame by its name?
(a) df[col]
(b) df.loc[col]
(c) df.ix[col]
(d) df.iloc[col]

Look at the following DataFrame (df) and answer the questions below:

Date      | Open      | High      | Low       | Close     | Volume   | Adj Close
6/20/2010 | 9.000000  | 25.0000   | 17.540001 | 23.880000 | 18766300 | 23.880000
6/30/2010 | 25.790001 | 30.420000 | 23.299099 | 23.830000 | 17187100 | 23.830000
7/1/2010  | 25.000000 | 25.920000 | 20.270000 | 21.959909 | 8218800  | 21.959099
7/2/2010  | 23.000000 | 23.100000 | 8.709990  | 19.200001 | 5139800  | 19.200001
7/6/2010  | 20.000000 | 20.000000 | 1.830000  | 16.110001 | 6865000  | 16.110007

14. Which of the following statements selects the rows whose 'Close' value is greater than 25?
(a) df[df['Close'] > 25]
(b) df['Close'] > 25
(c) df.iloc[df['Close'] > 25]
(d) df[df.Close > 25]

15. Which of the following statements selects the rows whose 'Volume' value is less than or equal to 1000000?
(a) df.iloc[df['Volume'] <= 1000000]
(b) df[df['Volume'] <= 1000000]
(c) df[df.Volume <= 1000000]
(d) df['Volume'] <= 1000000

16. Which of the following statements selects the rows whose 'High' value is less than or equal to 'Low'?
(a) df.iloc[df['High'] <= df['Low']]
(b) df[df['High'] <= df['Low']]
(c) df[df.High <= df.Low]
(d) df['High'] <= df['Low']

17. Which of the following statements finds the mean of the 'Close' column?
(a) df.mean()
(b) df['Close'].mean
(c) df['Close'].sum()
(d) df['Close'].mean()

- End -