Job Description:
Responsibilities for this role include:
• Design and develop SW features for AI frameworks, both HW-agnostic and HW-aware, especially in PyTorch device plugin and ML kernel development.
• Enhance and extend the deep learning training and inference capabilities in the software stack.
• Identify optimization opportunities in the software stack to enhance the performance of deep learning workloads.
• Work with the open-source community to support Intel GPUs and upstream code.
Qualifications:
- BTech, MS/MTech or PhD in CS, ECE or related fields with an overall experience of 6 to 12 years.
- Proficient in advanced C++ (C++14/17), with intermediate skills in Python and GPU programming.
- Experience in developing machine learning kernels such as GEMM, convolution, flash attention, MoE, etc.
- Hands-on experience in one of the frameworks such as PyTorch, TensorFlow or JAX.
- Practical knowledge of working on deep learning models such as NLP models and LLMs.
- Ability to debug complex issues in multi-layered SW systems. Understanding of SW integration in large open-source frameworks.
- Strong understanding of computer architecture and HW-SW optimization techniques.
- Experience in working on frameworks/platforms that have gone to production.
- Effective communication skills and experience working in cross-geo teams.
Preferred:
- Experience in developing and integrating CUTLASS or Triton based kernels
- Knowledge of compiler algorithms for heterogeneous systems and fuser optimizations.
Experienced Hire
Shift 1 (India)
India, Bangalore
This role will require an on-site presence. * Job posting details (such as work model, location or time type) are subject to change.