Infrastructure Engineer, Observability


San Francisco, CA, USA

Full time

Sep 29

This job is no longer accepting applications.

Build a more reliable Stripe.

Stripe’s infrastructure powers businesses all over the world. We process payments, run marketplaces, detect fraud, help entrepreneurs start an internet business from anywhere in the world, build world-class developer-friendly APIs, and more. If you’re an infrastructure engineer here, you’ll get to build the systems that power our products. The success of every single API request we process is critical to everyone involved! We can’t go down because our users’ businesses depend on us.

You’ll be on a team that maintains a product we provide to the rest of engineering, like storage or message queueing. You’ll make decisions with a significant impact on Stripe. There is a lot of work to do to make Stripe engineers’ work easier and our platform even more reliable than it is today, and we’d love for you to be part of it. We’re close to the people using our systems, so we constantly get feedback that we can use to make them better. The team will help all of engineering—from the CTO to our interns—by identifying, creating and automating engineering practices, processes and software that will be leveraged by the whole organization to improve reliability.

You’ll work with other infrastructure engineers as well as product engineers who use the systems you’re building.

We’re looking for people with a strong background (or interest!) in systems. We’d love to hear from you whether you’re a seasoned systems developer, or whether you’ve just learned you might like working with…

Many of our infrastructure engineers work remotely, and we’d be happy to talk to you about the possibility of working remotely.

You will:

  • Develop the core interfaces and infrastructure used by all of Stripe’s engineering teams
  • Design automated fault detection infrastructure and systems that run in 24x7 mode with yearly downtime measured in minutes
  • Scale the observability infrastructure to support hundreds of terabytes of logs and hundreds of billions of metric data points daily
  • Debug issues and solve distributed systems challenges across services and levels of the stack
  • Build best-in-class developer tooling for people using your infrastructure

We’re looking for people who:

  • Are able to write high quality code in a programming language (e.g. Ruby, Scala, Go)
  • Think about systems: their edge cases, failure modes and life cycles
  • Are comfortable operating infrastructure systems at scale
  • Wear every 9 of uptime as a badge of honor
  • Can debug complex problems across the whole stack
  • Focus on the needs of their users
  • Thrive in a high autonomy environment surrounded by unsolved problems
  • Have worked with data pipelines moving around large sets of data, quickly
  • Have managed an on-premises logging installation (e.g. Splunk, ELK), a time series metric database (e.g. Prometheus, InfluxDB, M3DB), or distributed tracing infrastructure
  • Are familiar with writing eBPF filters and debugging performance problems

What’s it like to work at Stripe?

Stripe is helping the internet fulfill its potential as a platform for economic progress by building software tools that accelerate global economic access and technological development. Stripe makes it easy to start, run and scale an internet business from anywhere in the world.

Stripe is, at its heart, an engineering company. To provide a missing pillar of core internet infrastructure, we hire people with a broad set of technical skills (and from a wide variety of backgrounds) who are ready to take on some of the most challenging problems in the industry – from reliably handling 100M API requests per day, to building adaptive machine learning as a result of years of data science and infrastructure work, and enabling entrepreneurs worldwide to start a global internet business.

We look at Stripe as a constant work in progress and the same is true of our people; for all of us, we believe the best is yet to come. We’re here to support each other in our curiosity and creativity – which we pursue through thoughtful discussion and knowledge-sharing among a diverse set of peers and colleagues.

We encourage all engineers to transition teams once every year and a half and also take on short-term projects with other teams across Stripe. This enables engineers to learn how different parts of Stripe work while also establishing stronger ties and cross-pollination between groups.

We contribute to existing open-source projects and support the people working on them, and we release several tools as open source.

We want to work in a company of warm, inclusive people who treat their colleagues exceptionally well: the kind of people who are committed to going out of their way to help other Stripes in the short term and pushing them to improve over the long term (by helping them to get better at what they do).

We’re a highly cross-functional organization and view that as part of the fun: we design our space to encourage as much collaboration as possible. We have long tables in the kitchen for a reason (to enable everyone to meet new people and learn from them). We also have a culture of transparency that we carry through to email communication, ensuring that Stripes all around the world have the information they need to make good local decisions.

In both our products and our people, we aim to reflect, represent and advocate for all of our users, globally. Our users transcend geography, culture and language; what we share, collectively, is a drive to create a fairer, more economically interconnected world.
