
Microsoft Senior Software Engineer Machine Learning Infrastructure 
United States, Washington 
222788430

16.07.2024

You will work alongside engineers and data scientists on the Inference team, which works directly with OpenAI to host models efficiently on Azure's large language model (LLM) infrastructure, serving a high volume of requests per day and optimizing large language models and diffusion models for inference at high scale and low latency.


Required Qualifications:

  • Bachelor's Degree in Computer Science, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
    • OR equivalent experience.
  • Experience in distributed computing and architecture, and/or developing and operating high scale, reliable online services.

Preferred Qualifications:

  • Experience with C/C++ and/or Python development
  • 6+ years of software development experience
  • Practical experience working on real-world applications that use CUDA and optimizing CUDA kernels for performance
  • Experience analyzing low-level program behavior, including performance and memory usage, and proficiency with profiling tools such as NVIDIA Visual Profiler, nvprof, and NVIDIA Nsight Compute
  • Understanding of parallel algorithms for communication between GPUs and familiarity with related libraries and frameworks such as DeepSpeed, PyTorch Distributed, Horovod, Megatron, MSCCL, and NCCL (see the sketch after this list)
  • Knowledge of and experience with OSS, Docker, Kubernetes, and programming languages such as Python, Go, and Rust
  • Knowledge of Large Language Model and Diffusion model architectures, e.g. GPT and Stable Diffusion
  • Experience developing low-latency systems
  • Experience working in a geo-distributed team
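
For illustration only (not part of the original posting), the sketch below shows the kind of GPU-to-GPU collective communication the libraries above provide, assuming PyTorch Distributed with the NCCL backend and a torchrun launcher; the file and tensor names are hypothetical.

    # allreduce_sketch.py (hypothetical file name)
    # Minimal multi-GPU all-reduce with PyTorch Distributed and the NCCL backend.
    # Launch with: torchrun --nproc_per_node=<num_gpus> allreduce_sketch.py
    import os

    import torch
    import torch.distributed as dist


    def main():
        # torchrun exports LOCAL_RANK, RANK, and WORLD_SIZE for each process.
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        # NCCL is the standard backend for GPU collectives.
        dist.init_process_group(backend="nccl")

        # Each rank contributes its own tensor; all_reduce sums them in place across ranks.
        x = torch.full((4,), float(dist.get_rank()), device="cuda")
        dist.all_reduce(x, op=dist.ReduceOp.SUM)

        print(f"rank {dist.get_rank()}: {x.tolist()}")
        dist.destroy_process_group()


    if __name__ == "__main__":
        main()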

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here.

Microsoft will accept applications for the role until July 20, 2024.

Responsibilities
  • Engage directly with key partners to understand and implement complex inferencing capabilities for state-of-the-art Large Language Models and Diffusion models.
  • Work with cutting-edge hardware stacks and a fast-moving software stack to deliver best-in-class inference and optimal cost.
  • Anticipate, identify, assess, track, and mitigate project risks and issues in a fast-paced, startup-like environment.
  • Motivated to build constructive and effective relationships and solve problems collaboratively.
  • Support production inference SLAs for core AI scenarios on one of the largest GPU fleets in the world (a minimal latency-measurement sketch follows this list).
  • Embody our culture and values.
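
For illustration only (not part of the original posting), the sketch below shows one simple way to measure the per-batch GPU inference latency an SLA would track, using CUDA events in PyTorch; the small stand-in model is hypothetical, not the production stack.

    # Minimal per-batch latency measurement with CUDA events (requires a CUDA GPU).
    import torch

    # Hypothetical stand-in model; a real deployment would serve an LLM or diffusion model.
    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 4096),
        torch.nn.GELU(),
        torch.nn.Linear(4096, 1024),
    ).cuda().eval()

    batch = torch.randn(32, 1024, device="cuda")

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)

    with torch.inference_mode():
        # Warm-up so kernel launches and allocator behavior settle before timing.
        for _ in range(5):
            model(batch)
        torch.cuda.synchronize()

        start.record()
        model(batch)
        end.record()
        torch.cuda.synchronize()

    # elapsed_time returns milliseconds between the recorded events.
    print(f"batch latency: {start.elapsed_time(end):.2f} ms")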