Machine Learning Engineer (Inference)

GW132
  • $200,000-$350,000
  • San Francisco, CA
  • Permanent

About the job




We are seeking an inference-focused Machine Learning Engineer to join a Stanford spin-out scale-up building a foundational infrastructure layer for AI inference.


The company was founded on the back of a successful exit, with the core of the previous founding team launching this new venture. Their aim is to dramatically improve inference efficiency across the stack, tackling custom compiler, kernel, and distributed orchestration bottlenecks.


They are hiring across the stack, with a particular focus on accelerating inference performance using cutting-edge research and engineering techniques. You will work on lower-level systems, building and optimizing serving stacks.


We are seeking a Machine Learning Engineer (Inference) with:


  • A focus on inference and serving environments
  • Experience building and optimizing inference stacks
  • Exposure to inference-related tools and frameworks, such as vLLM, TensorRT, SGLang, or PyTorch


Location: San Francisco or the Bay Area, on a hybrid basis


Compensation: Competitive base salary + meaningful equity + potential sign-on bonus


Tom Parker, Senior Software Systems & HPC Recruiter
