Understanding and fighting bullying with machine learning

We are a research team of computer scientists and social scientists who use machine learning algorithms to understand and fight bullying.

Bullying is a serious national health issue. Previous scientific studies of bullying relied on personal surveys in schools and suffered from small sample sizes and low temporal resolution. In contrast, our algorithms discover the participants of a bullying episode, their social roles, and their emotional responses from publicly available social media posts. The scientific data we produce can improve the understanding of bullying and inform intervention and policy-making.

Click to see an animation of two years' bullying tweets in 40 seconds.

The figure below shows the daily count of bullying traces (tweets about bullying) on Twitter as identified by our algorithm. The spikes correspond to celebrity events. For example, the spike around September 24, 2011 was due to Lady Gaga dedicating a song to bullying victim Jamey Rodemeyer, who had committed suicide a few days earlier. The counts also follow a weekly (7-day) cycle. Our algorithm identifies only a small fraction of bullying traces; the actual number is much larger.
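One simple way to look for a weekly cycle like the one described above is to average the daily counts by day of the week. The following is only a minimal sketch on made-up data, not the project's actual pipeline: the function names `daily_counts` and `weekday_means` and the sample dates are hypothetical, and the step that decides whether a tweet is a bullying trace is assumed to have already run.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(post_dates):
    """Count posts per calendar day (days with no posts are simply absent)."""
    return Counter(post_dates)


def weekday_means(counts):
    """Average daily count per weekday (index 0 = Monday ... 6 = Sunday)."""
    totals = [0] * 7
    days_seen = [0] * 7
    for day, n in counts.items():
        totals[day.weekday()] += n
        days_seen[day.weekday()] += 1
    return [t / d if d else 0.0 for t, d in zip(totals, days_seen)]


# Hypothetical data: four weeks with 3 posts on each weekday
# and 1 post on each weekend day.
start = date(2011, 9, 5)  # a Monday
posts = []
for offset in range(28):
    day = start + timedelta(days=offset)
    posts.extend([day] * (3 if day.weekday() < 5 else 1))

counts = daily_counts(posts)
means = weekday_means(counts)
print(means)
```

If the weekday averages differ clearly from the weekend averages, as they do in this toy data, the series has a 7-day pattern. Real data would also need the celebrity-event spikes handled separately, since a single large spike can distort the average for its weekday.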

What are these bullying traces about? The vast majority are responses to a bullying experience (online or offline) that has already happened. Only 4% are cyber-bullying, where a bully attacks a victim online. The pie chart shows who's talking: a reporter (not in the journalist sense) may say "I saw John hit a boy at school today"; an accuser may say "John you're a bully"; a victim often discusses their experience of being bullied; a bully often boasts and occasionally cyber-bullies; a defender defends the victim.

This project is based upon work supported by the National Science Foundation under Grant No. IIS-1216758. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.