Required Qualifications:
- Bachelor's Degree in Computer Science or a related technical discipline AND 4+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.
- Experience in distributed computing and architecture, and/or developing and operating high scale, reliable online services.
Preferred Qualifications:
- C/C++ and/or Python development
- 6+ years of software development experience
- Practical experience working on real-world applications that use CUDA, and experience optimizing CUDA kernels for performance
- Experience analyzing low-level program behavior, including performance and memory usage, and proficiency with profiling tools such as NVIDIA Visual Profiler, nvprof, and NVIDIA Nsight Compute
- Understanding of parallel algorithms for communication between GPUs, and familiarity with related libraries and frameworks such as DeepSpeed, PyTorch Distributed, Horovod, Megatron, MSCCL, and NCCL.
- Knowledge of and experience with open-source software (OSS), Docker, and Kubernetes, and with the Python, Go, and Rust programming languages
- Knowledge of Large Language Model and diffusion model architectures, e.g., GPT and Stable Diffusion
- Experience in developing low latency systems.
- Experience working in a geo-distributed team
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
Microsoft will accept applications for the role until July 20, 2024.