The goal of this project is to use a topic modeling approach to estimate the influence that different news firms have in a media network. I use a novel dataset of tweets from roughly 2,000 newspapers in the United States, and apply Latent Dirichlet Allocation (LDA) and Network Latent Dirichlet Allocation (nLDA) to estimate a measure of influence between these firms. This measure is then used to determine which firms are most significant for the proliferation of news events, and to develop a measure of the quality of different firms based on who trusts their message. Finally, I use the measure of influence to infer the structure of the media network and to determine the sampling of news firms that maximizes the potential for observing a novel news event.
The repository for this project is divided into two subdirectories: paper, which contains the LaTeX source for the accompanying paper, and code, which contains the code necessary to implement the work described here.