What you'll be doing:
- Numerics: research into low-bit number representations and their effect on neural network inference and training accuracy. This includes the requirements of existing state-of-the-art neural networks, as well as co-design of future neural network architectures and optimizers.
- System performance: research into parallelization approaches for large neural networks (both training and inference), communication patterns and their performance limiters on large GPU systems, and the overlap of communication and computation.
What we need to see:
- MS or PhD degree in computer science, computer engineering or a related field. Equivalent experience in some of the areas listed below can substitute for an advanced degree.
- 3+ years of relevant industry experience.
- Familiarity with state-of-the-art neural network architectures and optimizers.
- Experience with modern DL training frameworks and/or inference engines.
- Fluency in Python, C++, or ideally both.
- For numerics-focused candidates:
  - Experience training neural networks (LLMs and multimodal models are of particular interest) and exploring model architectures and optimizers.
  - Experience with quantization of neural networks, numerical analysis, number representations, and computer arithmetic.
- For systems-focused candidates:
  - Experience with parallel programming, performance analysis, and collective communications.
  - Background in computer architecture. Experience with GPU computing and CUDA is not required but is a big plus.
Intelligent machines powered by AI computers that can learn, reason and interact with people are no longer science fiction. Today, a self-driving car powered by AI can meander through a country road at night and find its way. An AI-powered robot can learn motor skills through trial and error. This is truly an extraordinary time. The era of AI has begun.
You will also be eligible for equity and benefits.