Required Qualifications:
- Bachelor's Degree in Computer Science or a related technical discipline AND 6+ years of technical engineering experience with coding in C, C++, or Python
- OR equivalent work experience.
- 1+ years of experience working on real-world applications that use CUDA, including experience optimizing CUDA kernels for performance.
- 1+ years of experience with parallel algorithms for communication between GPUs and familiarity with related libraries and frameworks (e.g., DeepSpeed, PyTorch Distributed, Horovod, Megatron, MSCCL, NCCL).
Other Qualifications:
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred/Additional Qualifications:
- Experience analyzing low-level program behavior, including performance and memory usage, and proficiency with profiling tools such as NVIDIA Visual Profiler, nvprof, and NVIDIA Nsight Compute
- Experience with OSS, Docker, Kubernetes
- Experience with the Python, Go, and Rust programming languages
- Knowledge of LLM/Diffusion model architectures (e.g. GPT, Stable Diffusion)
- Experience in distributed computing and architecture.
- Experience in developing low latency systems.
- Experience in developing and operating high scale, reliable online services.
- Experience working in a geo-distributed team
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
Microsoft will accept applications for the role until July 04, 2024.