Replicate

Who we are

We're a bunch of hackers, engineers, researchers, and artists.

We obsess about the details of API design and the right words for things. We're defining how AI works so we'd better get it right.

We make fast and reliable infrastructure. That's what a good infrastructure product is. We're not afraid to build things from scratch to make it the fastest.

We use AI for work. We use AI for play. We find unexplored parts of the map and create new techniques ourselves. We open-source it all.

We build in public, for the community. We want AI to work like open-source software so everyone benefits from it.

We're led by engineers. We're all doers. There's no bullshit here.

We've worked at places like Apple, Docker, GitHub, Heroku, and Spotify. We've created technologies like Docker Compose and OpenAPI.

We're here to build a big company. We're ambitious and hard-working. We're not just here to build nice things.

Join us

If this is you, we want to work with you.

Data engineer

You’re a generalist data and analytics expert who builds data infrastructure at scale. You act like an owner and have a desire to lead. You’ve likely been a data engineer at traditional companies, but you’re ready to be the first data hire at a startup.

Replicate is a complex business and we need solid data infrastructure to guide it. You’ll own this infrastructure.

We’re looking for the right person, not just someone who checks boxes, so you don’t need to satisfy all of these things. But you probably have some of these qualities:

Experience building from scratch. You’ve set up data stacks and pipelines, not just worked with an existing system.
Expert at SQL and other common analytics tools. You can turn a question into a query faster and more reliably than GPT-4.
You think like a software engineer. You put things in GitHub, use continuous integration, give things good names, and keep it all consistent with how we do things in the product. We want to create data systems that integrate with the rest of our systems, not a silo of their own with a different culture.
Fast and scrappy. You know how to get 80% of the value with 20% of the effort.
Experience with usage-based businesses. They’re much more complicated.
You like helping people. You’re not going to live in a cave. You’re going to be building the tools that make our finance, product, infrastructure, and growth teams successful. You’re not just going to give them numbers. You’re also going to help them tell a story about those numbers.

You’ll be working from our San Francisco office for this role.

Email us: [email protected]

Infrastructure engineer

You’re an infrastructure engineer who has experience building and operating things at scale.

We’re building the fastest way to deploy machine learning models. When somebody pushes a model to Replicate, we optimize it, pick the right GPU, deploy it on a cluster, automatically scale it from zero to n, and so on. All the hard stuff that companies doing ML struggle with.

Instead of being an infrastructure engineer at one of those companies, you could work for us and force-multiply yourself across thousands of companies.

We’re looking for the right person, not just someone who checks boxes, so you don’t need to satisfy all of these things. But you might have some of these qualities:

Experience building and scaling infrastructure at huge scale.
Experience in the ins and outs of Kubernetes.
Experience with serverless architectures.
Excellent communication skills. We think most of being a programmer is not programming. We want you to be able to communicate complex topics clearly, write down your thinking, write good docs, etc.

Email us: [email protected]

Machine learning performance engineer

You’re an engineer who lives and breathes high-performance machine learning. You have a deep understanding of how to make AI models run faster and more efficiently, and you’re excited about pushing the boundaries of what’s possible with current hardware.

At Replicate, we’re building the fastest way to deploy machine learning models. Your role will be crucial in optimizing the performance of the diverse range of models we host, ensuring they run as efficiently as possible on our infrastructure.

We’re looking for the right person, not just someone who checks boxes, so you don’t need to satisfy all of these things. But you might have some of these qualities:

Strong applied engineering skills. You’ve deployed machine learning models in scaled-up production environments and know the challenges that come with it.
Deep expertise in CUDA programming and GPU acceleration techniques. You can write custom kernels in your sleep.
Proficiency in C++ and Python. You’re comfortable diving deep into low-level optimizations and high-level model architectures alike.
Extensive experience with deep learning frameworks like Torch or JAX. You know their strengths, weaknesses, and how to squeeze every ounce of performance out of them.
A solid grasp of machine learning algorithms, especially diffusion models, large language models, or other generative AI techniques.
Familiarity with model quantization techniques, distillation, model pruning, etc. You understand the tradeoffs and know when to apply which technique.
You stay up-to-date with the latest developments in ML performance optimization. When a new technique drops, you’re already thinking about how to implement it.

You might be particularly good for this job if:

You’ve written custom CUDA kernels to significantly improve model latency and can share war stories about the process.
You can discuss the tradeoffs between fp8 and int8 quantization in depth, and have applied either (or both) to whatever hot new model dropped last week.
You get excited about diving into academic papers on ML optimization techniques and turning them into practical, production-ready code.

Email us: [email protected]

Product engineer

You’re a generalist engineer, leaning towards backend/infrastructure. You’ve probably worked on developer tools or APIs before, and have a refined sense of what makes an excellent developer tool.

We have this website (currently React + Django), an open source CLI (Go + Python), and an API (Go + Kubernetes). The website and the CLI are probably where you’ll be spending most of your time, but you might be touching any part of the stack, as well as all the other things that happen in the early stage of a company (talking to users, doing support, etc.).

We don’t mind what particular skills you already have. We figure you can pick up something new quickly.

We’re looking for the right person, not just someone who checks boxes, so you don’t need to satisfy all of these things. But you might have some of these qualities:

Extensive experience working on web products or developer tools.
Excellent communication skills. We think most of being a programmer is not programming. We want you to be able to communicate complex topics clearly, write down your thinking, write good docs, etc.
Experience working with infrastructure teams.
You don’t need to know anything about machine learning, but it might be handy.

Email us: [email protected]

We’re bringing AI to every software developer.
