Cyberbullying Identification Using Participant-Vocabulary Consistency
Abstract
With the rise of social media, people can now form relationships and communities easily regardless of location, race, ethnicity, or gender. However, the power of social media simultaneously enables harmful online behavior such as harassment and bullying. Cyberbullying is a serious social problem, making it an important topic in social network analysis. Machine learning methods can potentially help provide better understanding of this phenomenon, but they must address several key challenges: the rapidly changing vocabulary involved in cyberbullying, the role of social network structure, and the scale of the data. In this study, we propose a model that simultaneously discovers instigators and victims of bullying as well as new bullying vocabulary by starting with a corpus of social interactions and a seed dictionary of bullying indicators. We formulate an objective function based on participant-vocabulary consistency. We evaluate this approach on Twitter and Ask.fm data sets and show that the proposed method can detect new bullying vocabulary as well as victims and bullies.

1. Introduction

Social media has significantly changed the nature of society. Our ability to connect with others has been massively enhanced, removing boundaries created by location, gender, age, and race. However, the benefits of this hyper-connectivity also come with the enhancement of detrimental aspects of social behavior. Cyberbullying is an example of one such behavior that is heavily affecting the younger generations (Boyd, 2014). The Cyberbullying Research Center defines cyberbullying as "willful and repeated harm inflicted through the use of computers, cell phones, and other electronic devices." Like traditional bullying, cyberbullying occurs in various forms. Examples include name calling, rumor spreading, threats, and sharing of private information or photographs (see http://www.endcyberbullying.org and http://www.ncpc.org/cyberbullying). Even seemingly innocuous actions such as supporting offensive comments by "liking" them can be considered bullying (Wang et al., 2009). As stated by the National Crime Prevention Council, around 50% of American young people are victimized by cyberbullying. According to the American Academy of Child and Adolescent Psychiatry, victims of cyberbullying have strong tendencies toward mental and psychiatric disorders (American Academy of Child & Adolescent Psychiatry, 2016). In extreme cases, suicides have been linked to cyberbullying (Goldman, 2010; Smith-Spark, 2013). The phenomenon is widespread, as indicated in Fig. 1, which plots survey responses collected from students. These facts make it clear that cyberbullying is a serious health threat.

[Figure 1. Survey Statistics on Cyberbullying Experiences. Data collected and visualized by the Cyberbullying Research Center (http://cyberbullying.org/).]

Machine learning can be useful in addressing the cyberbullying problem. Recently, various studies have considered supervised, text-based cyberbullying detection, classifying
social media posts as 'bullying' or 'non-bullying'. Training data is annotated by experts or crowdsourced workers. Since bullying often involves offensive language, text-based cyberbullying detection studies often use curated swear words as features, augmenting other standard text features. Supervised machine learning approaches then train classifiers from this annotated data (Yin et al., 2009; Ptaszynski et al., 2010; Dinakar et al., 2011).

We identify three significant challenges for supervised cyberbullying detection. First, annotation is not an easy task: it requires cultural expertise and examination of the social structure of the individuals involved in each interaction. Because of the difficulty of labeling the often subtle distinctions between bullying and non-bullying, there is likely to be disagreement among labelers, making costs add up quickly for a large-scale problem. Second, reasoning about which individuals are involved in bullying calls for joint, or collective, classification. E.g., if we believe a message from A to B is a bullying interaction, we should also expect a message from A to C to have an increased likelihood of also being bullying. Third, language is rapidly changing, especially among young populations, making static text indicators prone to becoming outdated. Some curse words have completely faded away or are no longer as taboo as they once were, while new slang is frequently introduced into the culture. These three challenges suggest that we need a dynamic methodology to collectively detect emerging and evolving slurs with only weak supervision.
In this paper, we introduce an automated, data-driven method for cyberbullying identification. The eventual goal of such work is to detect these harmful behaviors in social media and intervene, either by filtering or by providing advice to those involved. Our proposed learnable model takes advantage of the fact that the data and concepts involve relationships. We train this relational model in a weakly supervised manner, where human experts provide a small seed set of phrases that are highly indicative of bullying. The algorithm then finds other bullying terms by extrapolating from these expert annotations. In other words, our algorithm detects cyberbullying from key-phrase indicators. We refer to our proposed method as the participant-vocabulary consistency (PVC) model; it seeks a consistent parameter setting for all users and key phrases in the data that characterizes the tendency of each user to harass or to be harassed and the tendency of a key phrase to be indicative of harassment. The learning algorithm optimizes the parameters to minimize their disagreement with the training data, which consists of highly indicative bullying phrases in messages between specific users.

A study by ditchthelabel.org (2013) found that Facebook, YouTube, Twitter, and Ask.fm are the platforms with the most frequent occurrences of cyberbullying. To evaluate the participant-vocabulary consistency method, we ran our experiments on Twitter and Ask.fm data. From a list of key phrases highly indicative of bullying, we subsample small seed sets to train the algorithm. We then examine how well the participant-vocabulary consistency method recovers the remaining, held-out set of indicative phrases. Additionally, we extract the highest-scoring detected phrases and qualitatively verify that they are in fact examples of bullying.

2. Related Work

There are two main branches of research related to our topic. One is online harassment and cyberbullying detection; the other is automated vocabulary discovery. Various studies have used fully supervised learning to classify bullying posts from non-bullying posts. Many of them focus on textual features of posts to identify cyberbullying incidents (Dinakar et al., 2011; Ptaszynski et al., 2010; Hosseinmardi et al., 2015; Chen et al., 2012; Margono et al., 2014). Some use features beyond the text itself, for example content, sentiment, and contextual features (Yin et al., 2009); the number, density, and value of offensive words (Reynolds et al., 2011); or the number of friends, network structure, and relationship centrality (Huang & Singh, 2014). Nahar et al. (2013) used semantic and weighted features; they also identify predators and victims using a ranking algorithm. Many studies have applied machine learning techniques to better understand social-psychological issues such as bullying, using data sets from Twitter, Instagram, and Ask.fm to study negative user behavior (Bellmore et al., 2015; Hosseinmardi et al., 2014a;b).

Various works use query expansion to dynamically extend search queries with additional terms. For example, Massoudi et al. (2011) use temporal information as well as co-occurrence to score related terms for expanding the query. Mahendiran et al. propose a method based on probabilistic soft logic to grow a vocabulary using multiple indicators (e.g., social network, demographics, and time).

3. Proposed Method

To model the cyberbullying problem, we assign each user ui a bully score bi and a victim score vi. The bully score measures how much a user tends to bully others; likewise, the victim score indicates how much a user tends to be bullied by other users. For each feature wk, we associate a feature-indicator score that represents how much the feature is an indicator of a bullying interaction.
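One way to make this consistency principle concrete is a regularized least-squares objective that ties the participant scores of each message to the indicator scores of the key phrases it contains. The following is an illustrative sketch under our own notational assumptions, not necessarily the exact objective optimized in this work:

J(b, v, w) = \sum_{(i,j) \in M} \sum_{k \in f(i,j)} \left( \tfrac{1}{2}(b_i + v_j) - w_k \right)^2 + \frac{\lambda}{2} \left( \lVert b \rVert^2 + \lVert v \rVert^2 + \lVert w \rVert^2 \right)

Here M is the set of (sender, receiver) pairs exchanging messages, f(i,j) indexes the key phrases in the message from ui to uj, and lambda is a regularization weight; one natural choice is to fix wk = 1 for the seed phrases so they anchor the scale. Because J is quadratic in each of b, v, and w separately, alternating minimization yields closed-form coordinate updates.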
4. Experiments

Table 1. Identified bullying bigrams detected by participant-vocabulary consistency from the Twitter and Ask.fm data sets.

Data Set   Selected High-Scoring Words
Twitter    sh*tstain arisew, c*nt lying, w*gger, commi f*ggot,
           sp*nkbucket lowlife, f*cking nutter, blackowned whitetrash,
           monster hatchling, f*ggot dumb*ss, *ssface mcb*ober,
           ignorant *sshat
Ask.fm     total d*ck, blaky, ilysm n*gger, fat sl*t, pathetic waste,
           loose p*ssy, c*cky b*stard, wifi b*tch, que*n c*nt,
           stupid hoee, sleep p*ssy, worthless sh*t, ilysm n*gger

The Ask.fm data set contains 41,833 users and 286,767 question-answer pairs.

We compare our method with two baselines. The first is co-occurrence: all words or bigrams that occur in the same tweet as any seed word are given the score 1, and all other words have score 0. The second baseline is dynamic query expansion (DQE) (Ramakrishnan et al., 2014). DQE extracts messages containing seed words, then computes the document frequency score of each word. It then iterates between selecting the k highest-scoring keywords and re-extracting relevant messages until it reaches a stable state.
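For concreteness, both baselines can be sketched in a few lines of Python. The function names, the tokenized-message representation, and the plain document-frequency scoring below are our own assumptions rather than reference implementations:

from collections import Counter

def cooccurrence_scores(messages, seeds):
    """Score 1.0 for every term that appears in at least one message
    alongside a seed word; all other terms implicitly score 0."""
    scores = {}
    for tokens in messages:  # each message is a list of tokens
        if any(t in seeds for t in tokens):
            for t in tokens:
                scores[t] = 1.0
    return scores

def dqe_scores(messages, seeds, k=50, max_iters=20):
    """DQE-style iteration: extract messages containing the current
    keywords, score terms by document frequency within that subset,
    keep the top k, and repeat until the keyword set stabilizes."""
    keywords = set(seeds)
    df = Counter()
    for _ in range(max_iters):
        relevant = [m for m in messages if any(t in keywords for t in m)]
        df = Counter(t for m in relevant for t in set(m))
        top = {w for w, _ in df.most_common(k)} | set(seeds)
        if top == keywords:  # stable state reached
            break
        keywords = top
    return {w: float(df.get(w, 0)) for w in keywords}

# Example usage on toy data:
# dqe_scores([["you", "are", "a", "loser"], ["nice", "day"]], seeds={"loser"})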
We evaluate performance by treating held-out words from our full curse-word dictionary as the relevant target words. We measure the true positive rate and the false positive rate, computing the receiver operating characteristic (ROC) curve for each compared method. Fig. 2 contains the ROC curves for both Twitter and Ask.fm. Co-occurrence only indicates whether words co-occur or not, so it forms a single point in ROC space. Our PVC model and DQE compute real-valued scores for words and therefore generate curves. DQE produces very high precision but does not recover many of the target words. Co-occurrence, in contrast, detects a high proportion of the target words, but at the cost of also recovering a large fraction of non-target words. PVC recovers a much higher proportion of the target words than DQE, enabling a good compromise between recall and precision.
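The word-level ROC computation itself is straightforward. The following is a minimal sketch with scikit-learn, assuming a dict mapping each word to its learned score and a set of held-out target words; it is our own illustration, not the authors' evaluation code:

from sklearn.metrics import roc_curve, auc

def word_roc(word_scores, target_words):
    """Treat each scored word as one example: positive if it belongs to
    the held-out dictionary, with the learned score as the prediction."""
    words = list(word_scores)
    y_true = [1 if w in target_words else 0 for w in words]
    y_score = [word_scores[w] for w in words]
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return fpr, tpr, auc(fpr, tpr)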
We also compute the average score of target words, non-target words, and all of the words. If the algorithm succeeds, the average target-word score should be higher than the overall average. For both Twitter and Ask.fm, our proposed PVC model captures target words much better than the baselines. We measure how many standard deviations the average target-word score is above the overall average. For Twitter, PVC provides a lift of around 1.5 standard deviations over the overall average, while DQE only produces a lift of 0.242. We observe the same behavior for Ask.fm: PVC learns scores with a lift of 0.825 standard deviations between the overall average word score and the average target-word score, DQE produces a small lift (0.0099), and co-occurrence has no apparent lift.
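This lift statistic reduces to a one-line computation. The NumPy sketch below states the formula we believe is being described, with our own function name:

import numpy as np

def lift_in_std_devs(all_scores, target_scores):
    """Number of standard deviations by which the mean target-word score
    exceeds the mean score over the whole vocabulary."""
    all_scores = np.asarray(all_scores, dtype=float)
    target_scores = np.asarray(target_scores, dtype=float)
    return (target_scores.mean() - all_scores.mean()) / all_scores.std()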
By manually examining the 1,000 highest-scoring words, we find many seemingly valid bullying words. These detected curse words include sexual, sexist, racist, and LGBT (lesbian, gay, bisexual, and transgender) slurs. Table 1 lists some of these high-scoring words from our experiments.

The PVC algorithm also computes bully and victim scores for users. By studying the profiles of highly scored victims on Ask.fm, we noticed that some of these users do appear to be bullied. This happens on Twitter as well, where some of the detected high-scoring users frequently use offensive language in their tweets. Fig. 3 shows some bullying comments directed at an Ask.fm user and her responses, all of which contain offensive language and seem highly inflammatory.

[Figure 3. Example of an Ask.fm conversation containing possible bullying and heavy usage of offensive language.]

5. Conclusion

In this paper, we proposed the participant-vocabulary consistency method to simultaneously discover victims, instigators, and the vocabulary of words that indicate bullying. Starting with a seed dictionary of high-precision bullying indicators, we optimize an objective function that seeks consistency between the scores of the participants in each interaction and the scores of the language used. For evaluation, we perform our experiments on data from Twitter and Ask.fm, services known to contain high frequencies of bullying. Our experiments indicate that our method can successfully detect new bullying vocabulary. We are currently working on creating a more formal probabilistic model for bullying to robustly incorporate noise and uncertainty.