Expoint - all jobs in one place

Finding a high-tech job at the best companies has never been easier


Amazon Sr. Software Engineer - AI/ML, AWS Neuron Apps
United States, Washington, Seattle 
77471058

27.04.2025
DESCRIPTION

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators.
The ML Apps team works side by side with chip architects, compiler engineers and runtime engineers to create, build and tune distributed training solutions with Trn1. Experience training these large models using Python is a must. FSDP, DeepSpeed and other distributed training libraries are central to this work, and extending them for Neuron-based systems is key.

Key job responsibilities
This role will help lead the effort to build distributed inference support into PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks. It will also help tune these models for the highest performance and efficiency when running on customer AWS Trainium and Inferentia silicon and the Trn1 and Inf1 servers. Strong software development skills in C++/Python and ML knowledge are both critical to this role.

A day in the life
Work/Life Balance
Mentorship & Career Growth

BASIC QUALIFICATIONS

- 5+ years of programming using a modern programming language such as Java, C++, or C#, including object-oriented design experience
- 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Fundamentals of machine learning and deep learning models, their architecture, and their training and inference lifecycles, along with work experience optimizing model execution


PREFERRED QUALIFICATIONS

- Bachelor's degree in computer science or equivalent