Careers

Member of Technical Staff

Full-time · San Francisco · On-site · $150K–$250K · 1%–2% equity

Wafer's mission is to maximize intelligence per watt by building AI that optimizes AI infrastructure. Our journey starts with GPU kernels, but will eventually expand into every corner of ML systems and AI infrastructure. We're a small team (4 people) backed by Fifty Years, Y Combinator, Jeff Dean, and Woj Zaremba (co-founder of OpenAI), and we're looking for engineers who want to work at the cutting edge of ML systems, AI agents, and GPU kernels. We believe Wafer will be one of the few truly generational companies, and, importantly, one that does a lot of good for the world in the process.

Here are five reasons this is an opportunity worth considering, and what makes us different:

1. Compute is the bottleneck for everything that matters. Our mission is to maximize intelligence per watt. Models will keep scaling, agents will run longer, and trillions will pour into compute as every AI lab races to automate every part of the economy. The infrastructure powering these automations will be the foundational layer of the century ahead.

2. We're going after the hardest problems in AI infrastructure. Squeezing maximum intelligence out of every watt consumed is one of the most technically challenging and consequential problems in the world. We don't shy away from this.

3. The team is and always will be deeply technical. Steven shipped infrastructure at Google and Two Sigma. Emilio published at NeurIPS and did AI research at Argonne. John built data center automation at AWS. Ian previously founded a YC startup (Willow). We've all seen what technical excellence looks like, and we're building the highest-caliber ML systems team in the world.

4. It's early, and the surface area is enormous. You'll make decisions that define our product, technical direction, and engineering culture for years. The abstractions you choose and the systems you build become foundational.

5. We're ambitious, and we work hard (worth being upfront about). The only outcome we care about is one where Wafer runs and optimizes every AI workload on the planet. That requires a lot of sacrifice and dedication, but we believe it's worth it. We're all in pursuit of building something generational.

What You'll Do

  • Build and improve our framework for GPU kernel optimization (multi-turn tool use, state management, reward signals)
  • Develop integrations with GPU profilers and compiler toolchains
  • Design the architecture for remote GPU execution across cloud GPUs
  • Work on trace analysis systems that help the agent diagnose performance bottlenecks
  • Ship features that engineers use daily, and that optimize infrastructure that runs the world’s AI (PyTorch, vLLM, NVIDIA, AMD, etc.)

What We Look For

You’re a strong fit if you:

  • Have deep technical intuition and can learn new domains quickly
  • Are comfortable working across the stack: Python, C++, TypeScript, CUDA
  • Can ship production code fast while maintaining quality
  • Want to work on some of the most interesting AI infra problems at a small company with a no-bullshit, ship-fast culture

Very nice to have:

  • GPU programming experience (CUDA, HIP, Triton)
  • Experience with profiling tools or compiler internals
  • Background in AI/ML research or agent systems
  • Publications or open-source work in relevant areas

Apply for this Position

PDF files only, max 5MB