Job Description

P-1285

About This Role


As a staff software engineer for GenAI Performance and Kernel, you will own the design, implementation, optimization, and correctness of the high-performance GPU kernels powering our GenAI inference stack. You will lead development of highly tuned, low-level compute paths, manage trade-offs between hardware efficiency and generality, and mentor others in kernel-level performance engineering. You will work closely with ML researchers, systems engineers, and product teams to push the state of the art in inference performance at scale.


What You Will Do

  • Lead the design, implementation, benchmarking, and maintenance of core compute kernels (e.g. attention, MLP, softmax, layernorm, memory management) optimized for various hardware backends (GPU, accelerators)

  • Drive the performance roadmap for kernel-level improvements: vectorization, tensorization, tiling, fusion, mixed precision...