Sean J. Taylor

Science paper on Social Influence Bias

Exciting news! The paper I co-authored with Lev Muchnik and my advisor, Sinan Aral, “Social Influence Bias: A Randomized Experiment,” was published in the August 9th edition of Science. Since many of you will read about the findings of the paper as interpreted by journalists, I thought it would be useful to give my own TL;DR version here, plus a bonus plot that didn’t make it into the paper:

[Figure: bonus plot]

The first-order result is that we have evidence that seeing prior ratings has a causal effect on rating behavior. When you rate things online, you are often exposed to others’ ratings (either aggregated or listed individually). It turns out that this exposure impacts rating decisions and creates path dependence in ratings. The implication is that, because of social influence bias in rating systems, high or low ratings do not necessarily imply high or low quality.

This finding requires a randomized experiment because items with currently high ratings could simply be high quality, so subsequent high ratings may not reflect any bias at all. In our design, we exogenously manipulate the initial ratings (“up-treatment” and “down-treatment”) in order to isolate a pure influence effect.
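To make the design concrete, here is a minimal sketch of that kind of randomized assignment in Python. The probabilities and names are hypothetical illustrations; the actual assignment rates are described in the paper and supplementary materials.

```python
import random

rng = random.Random(42)

# Illustrative treatment probabilities (hypothetical values,
# not the rates used in the actual experiment).
P_UP, P_DOWN = 0.05, 0.05  # remaining probability mass goes to control

def assign_treatment():
    """Assign a newly posted comment to a condition,
    independently of its content or quality."""
    u = rng.random()
    if u < P_UP:
        return "up"       # seed the comment with one artificial up-vote
    if u < P_UP + P_DOWN:
        return "down"     # seed the comment with one artificial down-vote
    return "control"      # leave the comment untouched
```

Because assignment is independent of quality, any systematic difference in subsequent ratings between the treated and control groups can be attributed to the artificial initial vote.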

I believe our study innovates beyond earlier work in this field (such as the excellent MusicLab experiments by Matt Salganik, Peter Dodds, and Duncan Watts) in at least two key ways.

First, because we examine both up- and down-treatments, we can characterize an interesting asymmetry in social influence. Up-treatment works exactly as we expect, creating a 25% increase in the final scores of comments. The effect of down-treatment is more nuanced, however: people seem to respond to negative ratings differently, either by correcting them or by herding on them, and these two responses roughly cancel out, leaving no net down-treatment effect on long-run ratings.
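As a toy illustration of how correction and herding can offset, consider this small simulation. The response rates are made up for the sketch, not the paper’s estimates: when the corrective shift in up-voting equals the herding shift in down-voting, treated and control comments end up with the same long-run scores apart from the seeded vote itself.

```python
import random

rng = random.Random(0)

# Toy response rates (made up for illustration). Baseline viewers
# up-vote with probability 0.10 and down-vote with probability 0.05.
# After a down-treatment, "correction" raises the up-vote rate and
# "herding" raises the down-vote rate; here the two shifts are equal.
BASE_UP, BASE_DOWN = 0.10, 0.05
CORRECT_SHIFT = HERD_SHIFT = 0.03

def final_score(down_treated, n_viewers=100):
    p_up = BASE_UP + (CORRECT_SHIFT if down_treated else 0.0)
    p_down = BASE_DOWN + (HERD_SHIFT if down_treated else 0.0)
    score = -1 if down_treated else 0  # the seeded down-vote
    for _ in range(n_viewers):
        u = rng.random()
        if u < p_up:
            score += 1
        elif u < p_up + p_down:
            score -= 1
    return score

treated = [final_score(True) for _ in range(2000)]
control = [final_score(False) for _ in range(2000)]
print(sum(treated) / len(treated) - sum(control) / len(control))
# roughly -1: apart from the seeded vote, correction and herding
# offset, so the long-run treatment effect on scores washes out
```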

Second, repeated observation of the same users over time, combined with the fact that users can up-vote, down-vote, or abstain in response to treatments, allows us to decompose treatment effects into selective turnout and opinion change. We find little evidence that our treatments inspire different types of people to rate (e.g., that up-treatment draws out only positively inclined voters), but we do find evidence that our treatments change the proportion of up-votes cast among subgroups in our sample. We conclude that while our manipulations do draw attention to comments and inspire more voting, they don’t do so in any systematic way we can identify, and that opinion change is at least one significant component of the effects we observe.
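One way to see the decomposition (my framing of the idea, not the paper’s exact estimator): per viewer, a comment’s expected score contribution factors into the probability of voting at all times the expected direction of the vote, so a treatment effect can enter through either factor. The numbers below are hypothetical, chosen only to illustrate the bookkeeping.

```python
# Per viewer, the expected contribution to a comment's score is
#   E[contribution] = P(vote) * (2 * P(up | vote) - 1)
# with up-votes counting +1 and down-votes counting -1, so a
# treatment effect can come from P(vote) changing (selective
# turnout) or from P(up | vote) changing (opinion change).

def expected_contribution(p_vote, p_up_given_vote):
    return p_vote * (2.0 * p_up_given_vote - 1.0)

# Hypothetical values for illustration only:
control = expected_contribution(p_vote=0.10, p_up_given_vote=0.65)
treated = expected_contribution(p_vote=0.12, p_up_given_vote=0.72)

# Vary one factor at a time to isolate each component.
turnout_part = expected_contribution(0.12, 0.65) - control
opinion_part = expected_contribution(0.10, 0.72) - control
print(treated - control, turnout_part, opinion_part)
# the small remainder is a turnout-opinion interaction term
```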

Distinguishing between these explanations was a sticking point for the reviewers and a major technical challenge in the paper. If you’re interested in more detail, I highly recommend reading the supplementary materials to see what I had been working on all spring.

There’s a bundle of other interesting and suggestive results in the paper and the supplementary materials, so I encourage you to read both (and pass any questions along to us!). Many thanks to my co-authors Lev and Sinan for being so awesome to collaborate with, and to the anonymous reviewers at Science, whose thoughtful criticism made the paper much better.