Some of our current areas of work, where we are actively looking for top researchers, include:
- Optimized runtime stacks for foundation model workloads, including fine-tuning, inference serving, and large-scale data engineering, with a focus on multi-stage tuning (such as reinforcement learning), inference-time compute, and the data preparation needs of complex AI systems.
- Optimizing models to run on multiple accelerators, including IBM’s AIU accelerator, by leveraging compiler optimizations, specialized kernels, libraries, and tools.
- Innovative use cases that effectively leverage this infrastructure and these models to deliver value.
- Pre-training language and multi-modal foundation models, including large-scale distributed training procedures, model alignment, and building specialized pipelines for various tasks such as effective LLM-generated data pipelines.
You should have one or more of the following:
- A master’s degree in computer science, AI, or a related field from a top institution
- 0-8 years of experience with modern ML techniques, including but not limited to model architectures, data processing, fine-tuning, reinforcement learning, distributed training, and inference optimization
- Experience with big data platforms such as Ray and Spark
- Experience working with PyTorch FSDP and Hugging Face libraries
- Programming experience in Python or web development technologies
- A growth mindset and a pragmatic attitude
- Peer-reviewed research at top machine learning or systems conferences
- Experience working with torch.compile, CUDA, Triton kernels, GPU scheduling, and memory management
- Experience working with open-source communities