Sean J. Taylor

Putting the Magic in Data Science

Here are slides from my talk at QCon last week:

Here’s my talk (which was not recorded) in a nutshell:

  • If we want to be valuable as data scientists, we should aspire to create as many “how did they do that?” moments as we can. I call these “hoverboards.” If we just count things, we are being terrible magicians. We probably don’t want to end up being the accountants of the 21st century.
  • Furthermore, these moments of magic should have impact: they should change the strategic or tactical decisions that people make.
  • I argue that magic in data science often comes from combining various “tricks” in novel ways. I describe four common tricks we use at Facebook, as well as a grab bag of others that I’ve found useful.
  • Tricks alone are not enough. People have to use the technology you create. That requires considering what I call data science’s last mile problem: how do we make the data inform or change people’s behavior?
  • I describe four important (non-exhaustive) last mile considerations: reliability, latency/interactivity, simplicity, and interestingness. Many of these are achieved not through data science alone, but by combining data science with tricks from software engineering, design, and computer science.

What is Data Science?

There’s one little side-argument I made that I’d like to cover here. One thing we have trouble agreeing on is what data science actually is. My latest theory is that we all work in different parts of the technology pipeline (see below), which starts with academics publishing papers at conferences like KDD and NIPS and ends with non-experts getting utility from the products and services we create. We’re all probably doing “data science,” but it might look very different (pure math, problem discovery, design/visualization, engineering) depending on which stage you’re involved in.

[Image: the technology pipeline]