Slate Star Codex (SSC) is a blog by the pseudonymous author Scott Alexander. The blog is a fixture of the "rationalist" community, an internet community that seeks to analyze difficult problems in science, culture, and politics with high standards of reason. For example, almost every SSC post features citations from peer-reviewed papers and some degree of statistical reasoning. Despite being largely unknown to the general public, Scott's writing has a sizable impact on academics, entrepreneurs, and tech professionals across the globe. When a reporter at the New York Times threatened to dox Scott in an article about the blog, thousands of people, including many influential public figures, signed a petition to halt the article*. Each year, Scott conducts a survey of his readers and publishes the results online*. I thought it would be worthwhile to shed some light on the nature of this unique online community. Here, I look at the demographics of those who participated in the 2020 survey. In particular, I am interested in the demographic breakdown of responses to the question "If you could vote in the US Democratic primary, who would you vote for?"
As you will see in the iPython notebook, the readers of SSC are very unrepresentative of America, in interesting ways. The 4361 American respondents are overwhelmingly male and white. The average age is around 35. To my surprise, over a third of readers are married, though most have zero children. Most work white-collar, computer-related office jobs. Most identify as atheist, despite most having been raised in religious environments. The most unrepresentative statistic of all is candidate preference: a plurality chose Andrew Yang for the Democratic primary, with Elizabeth Warren and Bernie Sanders in second and third, and Biden in a distant sixth. I also looked at self-reported IQ (based on a scientific test administered by a professional) and SAT scores.
In this first-pass analysis, I used group_by to look at the means and most frequent responses per selected candidate. I chose means rather than medians because many responses are very skewed, so I judged the mean to be more informative. It appears that the demographic breakdown per candidate mostly corresponds to the mean or most frequent response to each question for the sample as a whole (e.g. the most common profession for each of the top six candidates is "Computers," which is by far the most common profession for the sample as a whole). There are some notable deviations: voters for Tulsi Gabbard are older and have more children than voters for other candidates, and are also more likely to be theists. Further, voters for Andrew Yang and Bernie Sanders are more likely to be single, perhaps due to the more radical natures of their campaigns (though radical in different ways). In a future analysis, it would be useful to look not only at the mean or most frequent response, but also at the distribution of each category per presidential candidate. Qualitative variables could also be dummy-coded in order to examine them quantitatively (e.g. % male, % white).
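The per-candidate summary and the dummy-coding idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's actual code: it assumes pandas, and the toy data, the column names (`candidate`, `age`, `children`, `profession`, `gender`), and the helper `most_frequent` are all hypothetical stand-ins for the real survey fields.

```python
import pandas as pd

# Toy stand-in for the survey responses; columns and values are hypothetical.
survey = pd.DataFrame({
    "candidate": ["Yang", "Yang", "Yang", "Warren", "Warren", "Sanders"],
    "age": [28, 35, 30, 41, 33, 26],
    "children": [0, 1, 0, 2, 0, 0],
    "profession": ["Computers", "Computers", "Law",
                   "Computers", "Medicine", "Computers"],
    "gender": ["Male", "Male", "Female", "Female", "Male", "Male"],
})


def most_frequent(series):
    """Return the most common value; ties go to the first value in sorted order."""
    return series.mode().iloc[0]


# Mean of each numeric answer and most frequent categorical answer per candidate.
summary = survey.groupby("candidate").agg(
    n=("age", "size"),
    mean_age=("age", "mean"),
    mean_children=("children", "mean"),
    top_profession=("profession", most_frequent),
)

# Dummy-code a qualitative variable (1 = male, 0 = otherwise) so that its
# group mean can be read as a percentage.
survey["is_male"] = (survey["gender"] == "Male").astype(int)
pct_male = survey.groupby("candidate")["is_male"].mean() * 100

print(summary)
print(pct_male.round(1))
```

Reporting the group size `n` next to each mean is a deliberate choice here: candidates with only a handful of voters produce noisy means and modes, so the count helps a reader judge how much weight each row deserves.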
*https://www.dontdoxscottalexander.com/