Expoint - all jobs in one place

Amazon Sr. Software Engineer - AI/ML, AWS Neuron Distributed Training
United States, Washington, Seattle 
41766129

20.11.2024
DESCRIPTION

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators.
The ML Apps team works side by side with chip architects, compiler engineers, and runtime engineers to create, build, and tune distributed training solutions with Trn1. Experience training large models using Python is a must. FSDP, DeepSpeed, and other distributed training libraries are central to this work, and extending them for Neuron-based systems is key.

Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS

- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one programming language
- 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Experience as a mentor, tech lead, or leader of an engineering team


PREFERRED QUALIFICATIONS

- Bachelor's degree in computer science or equivalent
- Knowledge of machine learning frameworks and end-to-end model training