Continuous Variables, Pt. 2
Weekly Savings
A B C D
0.909 -95.249 -95.249 -195.2
1.496 -75.119 -75.119 -175.1
2.024 -61.774 -61.774 -161.8
2.405 -46.180 -46.180 -146.2
3.272 -39.818 -39.818 -139.8
4.769 -37.617 -37.617 -137.6
5.578 -22.619 -22.619 -122.6
6.647 -16.224 -16.224 -116.2
7.927 -6.186 -6.186 -106.2
10.859 -4.291 -4.291 -104.3
12.035 -3.251 -3.251 -103.3
13.924 -2.036 -2.036 -102.0
14.074 -1.523 -1.523 -101.5
14.227 0.577 0.577 -99.4
14.580 8.376 8.376 -91.6
16.234 10.078 10.078 -89.9
18.852 13.598 13.598 -86.4
19.980 16.915 16.915 -83.1
24.440 17.470 17.470 -82.5
25.109 18.648 18.648 -81.4
25.676 20.098 20.098 -79.9
25.867 24.333 24.333 24.3
26.000 28.198 28.198 28.2
28.535 31.104 31.104 31.1
29.543 31.805 31.805 31.8
30.478 32.744 32.744 32.7
30.648 35.035 35.035 35.0
39.095 37.773 37.773 37.8
40.210 40.510 40.510 40.5
47.266 40.707 40.707 40.7
51.398 41.001 41.001 41.0
52.306 45.793 457.933 45.8
57.083 48.475 484.753 48.5
58.269 49.300 493.004 49.3
65.167 49.784 497.842 49.8
65.548 52.184 521.844 52.2
73.493 52.619 526.191 52.6
73.726 54.153 541.527 54.2
74.934 55.683 556.831 55.7
82.537 59.798 597.985 59.8
85.918 62.602 626.023 62.6
92.275 65.141 651.414 65.1
95.689 65.371 653.706 65.4
104.578 70.356 703.560 70.4
124.599 76.699 766.989 76.7
192.962 78.215 782.148 78.2
194.340 103.500 1034.998 103.5
199.995 109.222 1092.217 109.2
249.964 119.499 1194.992 119.5
302.121 128.147 1281.472 128.1
350.536 139.366 1393.657 139.4
416.852 163.109 1631.089 163.1
Histograms
Boxplots
Boxplot (Person “D”)
## min lower-hinge median upper-hinge max
https://www.explainxkcd.com/wiki/index.php/539:_Boyfriend
What does it take to be an outlier?
Source: http://www.boxofficemojo.com/weekend/chart/
fences:
## [11] 3.303 4.674 4.755 5.735 9.110 13.127 13.203 15.861 18.238 31.003
Sometimes boxplots are drawn using the IQR (interquartile range) instead of hinge spread
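The difference between the two spreads can be checked directly in base R. The following is a minimal sketch, assuming the usual 1.5 multiplier for the fences; the vector `x` and the helper `fences()` are made up for illustration and do not come from the slides. `fivenum()` returns Tukey's hinges, while `quantile()` (default type 7) returns the quartiles that `IQR()` is based on.

```r
# Illustrative data (not from the slides)
x <- 1:10

# Hypothetical helper: fences lie `coef` spreads beyond the box edges
fences <- function(lower, upper, coef = 1.5) {
  spread <- upper - lower
  c(lower = lower - coef * spread, upper = upper + coef * spread)
}

hinges    <- fivenum(x)[c(2, 4)]                     # 3.00 and 8.00
quartiles <- unname(quantile(x, c(0.25, 0.75)))      # 3.25 and 7.75

hinge_fences <- fences(hinges[1], hinges[2])         # -4.5 and 15.5
iqr_fences   <- fences(quartiles[1], quartiles[2])   # -3.5 and 14.5

hinge_fences
iqr_fences
```

For most sample sizes the hinges and quartiles are close but not identical, so a point near a fence can be an outlier under one rule and not under the other.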
base R vs. ggplot2
# base R
boxplot(mtcars$mpg)
# ggplot2
ggplot(mtcars, aes(y = mpg)) +
  geom_boxplot() +
  theme_grey(14)
boxplot stats
# base R
boxplot.stats(mtcars$mpg)
## $stats
##
## $n
## [1] 32
##
## $conf
## $out
# ggplot2
g <- ggplot(mtcars, aes(y = mpg)) +
  geom_boxplot()
ggplot_build(g)$data[[1]][,1:6]
ymin lower middle upper ymax outliers
10.4 15.4 19.2 22.8 32.4 33.9
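The ggplot2 output reports 33.9 as an outlier with `ymax` at 32.4, whereas `boxplot.stats()` does not flag any point for the same data. A short check in base R reproduces the difference. It is a sketch under the assumption that ggplot2 builds its box from `quantile()` with the 1.5 × IQR rule, while `boxplot.stats()` uses the hinges from `fivenum()`; the variable names are illustrative.

```r
mpg <- mtcars$mpg

# Base R rule: hinges from fivenum()
h <- fivenum(mpg)[c(2, 4)]
upper_fence_hinge <- h[2] + 1.5 * diff(h)

# Quartile rule (assumed to match ggplot2): quantile(), default type 7
q <- unname(quantile(mpg, c(0.25, 0.75)))
upper_fence_iqr <- q[2] + 1.5 * diff(q)

flagged_hinge <- mpg[mpg > upper_fence_hinge]   # nothing is flagged
flagged_iqr   <- mpg[mpg > upper_fence_iqr]     # 33.9 is flagged

flagged_hinge
flagged_iqr
boxplot.stats(mpg)$out                          # agrees with the hinge rule
```

The lower hinge (15.35) and the lower quartile (15.425) differ slightly, which moves the upper fence from about 33.98 to about 33.86, just enough to put 33.9 on the other side.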
[Figure: density curve; y-axis "density", ticks from 0.000000 to 0.000030]
geom_density(color = "red")
geom_density(color = "red") +
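A complete call might look like the sketch below. It assumes the density curve is drawn on top of a histogram; `mtcars$mpg`, `bins = 10`, and the fill colors are illustrative choices, not the data or settings behind the slide's figure.

```r
library(ggplot2)

# Histogram on the density scale with a kernel density curve on top.
# after_stat(density) rescales the bars so their total area is 1,
# which puts them on the same y-axis as geom_density().
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10,
                 color = "white", fill = "grey60") +
  geom_density(color = "red")
p
```

Without the `after_stat(density)` mapping the histogram would show counts, and the density curve would appear flattened against the x-axis.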
Source: https://eagereyes.org/blog/2017/joy-plots
Additional resources:
http://blog.revolutionanalytics.com/2017/07/joyplots.html
https://blogs.scientificamerican.com/sa-visual/pop-culture-pulsar-origin-story-of-joy-division-s-unknown-pleasures-album-cover-video/
Ridgeline plot inspiration
Jocelyn Bell discovers first radio pulsars, 1967
Ridgeline plot
Ridgeline plot, change scale
Histogram vs. ridgeline
Ridgeline vs. boxplot
Source: https://twitter.com/lenkiefer/status/916823350726610946
ggridges package
CRAN https://CRAN.R-project.org/package=ggridges
GitHub https://github.com/clauswilke/ggridges
https://cran.r-project.org/web/packages/ggridges/vignettes/gallery.html
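A basic ridgeline plot with ggridges can be sketched as follows. The built-in `iris` data and the `scale` and `alpha` values are assumptions chosen for illustration; they are not the examples shown in the slides.

```r
library(ggplot2)
library(ggridges)

# One density ridge per species, stacked along the y-axis.
# scale controls ridge height relative to the spacing between groups:
# values above 1 let neighboring ridges overlap.
p <- ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_density_ridges(scale = 1.5, alpha = 0.6)
p
```

Lowering `scale` to 1 or below keeps the ridges from overlapping, which is easier to read but loses some of the compactness of the display.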