1. 02:10 16 Jun 2014


    The following is an excerpt from my dissertation. It felt sad to bury all my gratitude in a gigantic research document that not many people will read. To all the amazing people in my life who aren’t mentioned here: this was written somewhat (very) hastily and I’m (probably) grateful to you as well :)


  2. 12:42 12 Mar 2014

    Tags: eda, videos

    Learn Exploratory Data Analysis

    My friends Moira, Dean, and Solomon, all members of Facebook’s Data Science team, worked with Udacity to create a fantastic course on exploratory data analysis. If you’re new to R/ggplot, or just want to hear how experts think about visualizing and exploring data, this would be a great place to start. I really can’t recommend these instructors highly enough.

    They asked several members of our team to talk about an EDA project each of us had worked on, and these interviews are included in the course. You can watch me talk about visualizing the sentiment of posts about NFL teams here. I discuss using splines versus more flexible models for time series data, and how that choice relates to the bias-variance tradeoff.

  3. 00:10 11 Mar 2014

    Tags: conferences, experiments

    Tutorial: Online experiments for computational social science

    Eytan Bakshy and I are giving a tutorial this year at ICWSM (June 1 in Ann Arbor). Sign up and learn some awesome stuff!

    Registration for ICWSM isn’t open yet, but you can sign up for a reminder when it goes live. We’ll only email you one time.

    Taught by two researchers on the Facebook Data Science team, this tutorial teaches attendees how to design, plan, implement, and analyze online experiments. First, we review basic concepts in causal inference and motivate the need for experiments. We then discuss basic statistical tools that help in planning experiments: exploratory analysis, power calculations, and the use of simulation in R. Next, we cover statistical methods for estimating causal quantities of interest and constructing appropriate confidence intervals. Particular attention is given to scalable methods suitable for “big data”, including working with weighted data and clustered bootstrapping. We then discuss how to design and implement online experiments using PlanOut, an open-source toolkit for advanced online experimentation used at Facebook. We show how basic “A/B tests”, within-subjects designs, and more sophisticated experiments can be implemented. We demonstrate how experimental designs from the social computing literature can be implemented, and review in detail two very large field experiments conducted at Facebook using PlanOut. Finally, we discuss issues with logging and common errors in the deployment and analysis of experiments. Attendees will be given code examples and will participate in the planning, implementation, and analysis of a Web application using Python, PlanOut, and R.
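
To give a flavor of the clustered bootstrapping mentioned in the abstract, here is a minimal sketch in Python. It is not taken from the tutorial materials: the function name `clustered_bootstrap_ci`, the `(cluster_id, treated, outcome)` row layout, and the choice of a percentile interval are all assumptions made for illustration. The idea is to resample whole clusters (for example, users) with replacement rather than individual rows, so that correlated observations from the same cluster stay together.

```python
import random
import statistics
from collections import defaultdict


def clustered_bootstrap_ci(rows, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI for a difference in means, resampling whole clusters.

    rows: iterable of (cluster_id, treated, outcome) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    rng = random.Random(seed)

    # Group observations by cluster so each cluster is resampled as a unit.
    by_cluster = defaultdict(list)
    for cluster_id, treated, outcome in rows:
        by_cluster[cluster_id].append((treated, outcome))
    clusters = list(by_cluster.values())

    def diff_in_means(sample):
        treat = [y for cluster in sample for t, y in cluster if t]
        ctrl = [y for cluster in sample for t, y in cluster if not t]
        if not treat or not ctrl:
            return None  # a replicate with an empty arm is skipped
        return statistics.fmean(treat) - statistics.fmean(ctrl)

    estimates = []
    for _ in range(n_boot):
        # Draw as many clusters as we observed, with replacement.
        resampled = rng.choices(clusters, k=len(clusters))
        est = diff_in_means(resampled)
        if est is not None:
            estimates.append(est)

    estimates.sort()
    lo = estimates[int((alpha / 2) * len(estimates))]
    hi = estimates[min(len(estimates) - 1,
                       int((1 - alpha / 2) * len(estimates)))]
    return diff_in_means(clusters), (lo, hi)
```

Called on rows where treatment is assigned per user, this returns the point estimate and an interval whose width reflects between-user variation. Resampling individual rows instead would typically understate that uncertainty when each user contributes several correlated observations.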