Director of Data Science, Facebook

David is the Director of Data Science for Infrastructure at Facebook, the world’s largest social media platform. His main role is to supervise a team of data scientists who plan, optimize, and understand Facebook’s infrastructure, such as data centers, software systems, and networking equipment. As with most back-end careers in tech, David tells us, when Facebook is performing well “you’re not seeing us at all.”

Transcript

Hi, my name is David Clausen and I'm the director of data science at Facebook for infrastructure. My main responsibilities are supporting a team of data scientists that focus on helping to plan, optimize, and understand Facebook's infrastructure, data centers, networking equipment, software systems, to overall help the site run and grow. If Facebook is performing well you're not seeing us at all. Our goals are mostly kind of long-term planning and optimization, so on a day-to-day basis you're probably not gonna see the interactions as a result of our work; it's a lot of studies, analysis, experimentation, modeling, that plays out over the course of many years.
So this is a problem that some people that I work with executed that we supported. The idea is Facebook is a large company, it's got thousands of software developers that are continually writing code on a regular basis, and they need to deploy that to a fleet of many servers and actually update what is Facebook and what are the systems that run Facebook on a regular basis. In order to do that we need to ship code to the billions of people that access the site multiple times a day, and that is a way of allowing us to be efficient and move fast with our software development resources, so if you have an idea you can go work with it, execute it, and try it out and see how it works. On the other hand, we want to make sure that we don't break the site with a bad bit of code or do something that uses a lot of CPU or RAM or storage, some of the core, scarce resources that we're trying to plan and optimize for.
And so we worked on a project that basically built in regression detection, which is a modeling technique that tries to predict the trajectory of a time series for different metrics we care about, in this case the amount of CPU utilization that our website generated, and to very quickly detect when that deviates from what's expected and then send alerts to the people that did the offending change and quickly get it reversed. And so this is a way of, in an operationally efficient way, using modern machine learning techniques, allowing software developers to do their job efficiently and effectively and fast.
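
Editor's illustration (not part of the interview): the regression-detection idea David describes, forecasting a metric and flagging when it deviates from the forecast, can be sketched roughly as below. This is a minimal sketch under assumptions, not Facebook's actual system: the forecast is just the mean of a trailing window, the function name `detect_regressions`, the `window` and `threshold` parameters, and the sample CPU readings are all hypothetical.

```python
from statistics import mean, stdev


def detect_regressions(series, window=5, threshold=3.0):
    """Return indices whose value deviates from a trailing-window forecast.

    Hypothetical helper for illustration only: the "forecast" is the mean of
    the previous `window` points, and a point is flagged when it sits more
    than `threshold` standard deviations away from that forecast.
    """
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        expected = mean(history)
        spread = stdev(history)
        # Guard against a perfectly flat history, where stdev is zero.
        tolerance = threshold * max(spread, 1e-9)
        if abs(series[i] - expected) > tolerance:
            flagged.append(i)
    return flagged


# Made-up CPU utilization readings (percent); the jump at index 7 stands in
# for a code change that suddenly uses more CPU.
cpu = [50.0, 51.0, 49.5, 50.5, 50.0, 50.2, 49.8, 62.0, 61.5, 62.3]
print(detect_regressions(cpu))
```

In this sketch only the first point of the jump is flagged, because later points are compared against a window that already contains the new level. A production system would presumably use a richer forecasting model and tie each flagged point back to the change that caused it, which the transcript mentions but does not detail.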
