Site reliability engineer: What I do and how much I make

Ruth Grace Wong, a Site Reliability Engineer at Pinterest, ensures the platform's smooth operation. Her role includes proactive and reactive tasks, enhancing system reliability, and assisting other engineers. She emphasizes the importance of coding skills, perseverance, and learning without always being an expert.

Video transcript

My name is Ruth Grace Wong, I'm 24 years old, I'm a Site Reliability Engineer at Pinterest on the Core Site Reliability Engineering Team, and my salary is approximately $120,000. Pinterest makes a website and mobile apps, and they allow people to collect things that they find on the internet, ideas that they wanna put into their own lives, and save them all in one place as inspiration. Core site reliability is responsible for the overall reliability of Pinterest. We're always trying to be proactive to help improve the experience for engineers. We've got about 400 engineers at Pinterest and our goal is to help them make their services more reliable, and we also have 150 million users of Pinterest, and so we want Pinterest to work well for them. For site reliability engineering, we have two categories of responsibilities. There's proactive and there's reactive. So, reactive work would be looking at operations requests if somebody needs help with something, and then proactive would be improving the systems so that they're more reliable and easier for people to use. I think there are two main skills that are good to have. The first one is learning to be okay with feeling like you aren't the expert and you might not ever be the expert, but kind of diving in and doing your best anyways. And knowing how to code is also really good because then you can automate what you're doing and improve the system. Problems are so complex that it's important to also persevere. Sometimes I'll get stuck on something and I'll try to work on something else and then come back to it. It's also really important to ask the people around you for help because often there's that one senior engineer who knows all these details and they're not written down.
I guess the most frustrating thing, or difficult thing about this job is that sometimes the problem that you're trying to fix is just so deep, so complicated, you try all these things, they don't work, and it turns out it's something that doesn't even make sense. Sean, who works on Kafka here at Pinterest, he once had this problem where certain machines would run fine and then other machines would not, and he figured out that it was because of the way they were named, certain numbers on the end of the machine name were not working because they were being converted to octal. And that's just an example of a problem that's so crazy that you would never be able to figure it out unless you met somebody that had figured it out before.
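
The transcript doesn't say which tool or language did the octal conversion, so the following is only a rough sketch of how that kind of bug can happen. It assumes a parser that, like some C-style number parsers, treats a leading zero as "this is octal". The host names and both helper functions are hypothetical and made up for illustration.

```python
def parse_like_c(text: str) -> int:
    """Hypothetical parser: a leading 0 on a multi-digit number means octal."""
    if len(text) > 1 and text.startswith("0") and text.isdigit():
        # int(..., 8) raises ValueError if the text contains an 8 or a 9.
        return int(text, 8)
    return int(text, 10)


def host_index(hostname: str) -> int:
    """Pull the numeric suffix off a made-up host name such as 'kafka-010'."""
    suffix = hostname.rsplit("-", 1)[1]
    return parse_like_c(suffix)


# Made-up host names: some parse "fine", some silently change, some fail.
for name in ["kafka-007", "kafka-010", "kafka-008"]:
    try:
        print(name, "->", host_index(name))
    except ValueError:
        print(name, "-> not a valid octal number")
```

Under these assumptions, a suffix like 007 comes out unchanged, 010 quietly becomes 8, and 008 fails outright, which is one way that only machines with certain numbers on the end of their names would misbehave.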