Expoint - all jobs in one place

Microsoft Principal Software Engineer 
United States, Washington 
432165575

Today

Qualifications

Required Qualifications:

  • Bachelor's Degree in Computer Science or a related technical discipline AND 6+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
      o OR equivalent experience.

  • Practical experience working on real-world applications that use CUDA, and experience optimizing CUDA kernels for performance.
  • Understanding of parallel algorithms for communication between GPUs, and familiarity with related libraries and frameworks such as DeepSpeed, PyTorch Distributed, Horovod, Megatron, MSCCL, and NCCL.


Preferred/Additional Qualifications:

  • Experience analyzing low-level program behavior, including performance and memory usage, and proficiency with profiling tools such as NVIDIA Visual Profiler, nvprof, and NVIDIA Nsight Compute.
  • Background in C/C++ and/or Python development.
  • Knowledge of and experience with OSS, Docker, Kubernetes, and the Python, Go, and Rust programming languages.
  • Knowledge of LLM and diffusion model architectures, e.g., GPT and Stable Diffusion.
  • Experience in distributed computing and architecture.
  • Experience in developing low latency systems.
  • Experience in developing and operating high scale, reliable online services.
  • Effective communication and collaboration skills; a great team player.
  • Experience working in a geo-distributed team

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:


Responsibilities
  • Engage directly with key partners to understand and implement complex inferencing capabilities for state-of-the-art LLMs and Diffusion models.
  • Work with cutting-edge hardware stacks and a fast-moving software stack to deliver best-in-class inference at optimal cost.
  • Anticipate, identify, assess, track, and mitigate project risks and issues in a fast-paced, startup-like environment.
  • Build constructive and effective relationships and solve problems collaboratively.
  • Support production inference SLAs for core AI scenarios on one of the largest GPU fleets in the world.

Other:

  • Embody our and