

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators.
Key job responsibilities
A day in the life
As you design and code solutions to help our team drive efficiencies in software architecture, you’ll create metrics, implement automation and other improvements, and resolve the root cause of software defects. You’ll also:
- Participate in design discussions and code reviews, and communicate with internal and external stakeholders.
- Work in a startup-like development environment, where you’re always working on the most important stuff.
- 5+ years of non-internship professional software development experience
- 5+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Knowledge of the fundamentals of machine learning and LLMs, including their architecture and their training and inference lifecycles, along with work experience optimizing model execution
- Experience programming with at least one software programming language
- 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Master's degree in computer science or equivalent

Key job responsibilities
You will lead efforts to build distributed training support into PyTorch and JAX using XLA, the Neuron compiler, and runtime stacks. You will optimize models to achieve peak performance and maximize efficiency on AWS custom silicon, including Trainium and Inferentia, as well as Trn2, Trn1, Inf1, and Inf2 servers. Strong software development skills, the ability to deep dive, work effectively within cross-functional teams, and a solid foundation in Machine Learning are critical for success in this role.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About AWS
Work/Life Balance
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent

The Product: AWS Machine Learning accelerators are at the forefront of AWS innovation. The Inferentia chip delivers best-in-class ML inference performance at the lowest cost in the cloud. Trainium will deliver the best-in-class ML training performance with the most teraflops (TFLOPS) of compute power for ML in the cloud. This is all enabled by a cutting-edge software stack, the AWS Neuron Software Development Kit (SDK), which includes an ML compiler, Neuron Kernel Interface (NKI) compiler, and runtime that natively integrates into popular ML frameworks, such as PyTorch and TensorFlow.
Neuron Kernel Interface (NKI) is a bare-metal language and compiler for directly programming NeuronDevices available on AWS Trn/Inf instances. You can use NKI to develop, optimize and run new operators directly on NeuronCores while making full use of available compute and memory resources.
You have knowledge of resource management, scheduling, code generation, optimization, and instruction architectures including CPU, NPU, GPU and novel forms of compute.
Work/Life Balance
Mentorship & Career Growth
- 5+ years of engineering team management experience
- 9+ years of working directly within engineering teams experience
- 4+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience
- Experience partnering with product or program management teams
- Understanding of compilers (resource management, instruction scheduling, code generation, and compute graph optimization)
- Strong software design fundamentals and excellent system-level coding skills
- M.S. or Ph.D. in Computer Science or related technical field

Key job responsibilities
In this role you'll develop, design, maintain, deploy, monitor and support a very important component in the Nitro firmware, while enjoying every step of the journey.
About the team
Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Why AWS
Work/Life Balance
Inclusive Team Culture
Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent

Key job responsibilities
You will lead efforts to build distributed training support into PyTorch and JAX using XLA, the Neuron compiler, and runtime stacks. You will optimize models to achieve peak performance and maximize efficiency on AWS custom silicon, including Trainium and Inferentia, as well as Trn2, Trn1, Inf1, and Inf2 servers. Strong software development skills, the ability to deep dive, work effectively within cross-functional teams, and a solid foundation in Machine Learning are critical for success in this role.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About AWS
Work/Life Balance
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- Bachelor's degree in computer science or equivalent
- 5+ years of non-internship professional software development experience
- 5+ years of programming with at least one software programming language experience
- 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Experience as a mentor, tech lead or leading an engineering team
- Experience in machine learning, data mining, information retrieval, statistics or natural language processing
- Master's degree in computer science or equivalent
- Experience in computer architecture
- Previous software engineering experience with PyTorch, JAX, or TensorFlow, distributed libraries and frameworks, and end-to-end model training

Key job responsibilities
• Drive a safety centric culture and ensure a safe workplace for builders and visitors to our sites.
• Oversee the performance of the data center's critical physical infrastructure. Ensure that all work performed is completed to the highest quality and without impact to customers.
• Lead a team of 24x7 engineering technicians with an emphasis on career growth.
• Drive improvement projects from conception to completion, often reaching out to a variety of support teams along the way.
• Coordinate daily with third-party vendors, ensuring adherence to contracted SLAs.
• Effectively and efficiently manage the operations budget and expenditures.
• Routinely operate as the after-hours on-call Data Center Facility Manager for the data centers in the region. This includes responding to any issues within the data centers and managing the investigation, mitigation, and recovery of the issue(s).
A day in the life
As the Facility Manager, your role demonstrates a strong commitment to prioritizing the development and well-being of team members, as well as fostering diversity and inclusion. You will oversee all facets of the data center's critical infrastructure with a focus on continuous availability and optimal performance, while upholding high-quality standards and minimizing any impact on internal and external customers. Additionally, you will play a crucial role in process optimization, staff management, setting performance metrics, and driving continuous improvement initiatives, all while ensuring a supportive and inclusive work environment.
- Experience in people management and team development
- Experience in engineering work, managing large-scale services
- Experience maintaining SLAs through the implementation of proactive issue detection and reporting
- Experience operating a mission-critical team or product
- High school or equivalent
- This role requires you to be a national of an EU member state.
- Bachelor's degree in Electrical Engineering, Mechanical Engineering, or a related field
- Knowledge of the electrical and mechanical systems involved in critical data center operations including systems such as feeders, transformers, generators, switchgear, UPS systems, ATS units, PDU units, chillers, pumps, air handling units, and CRAC units
- Experience in a management position with 5 or more direct reports
- Experience working in data centers with an emphasis on building and equipment operation

As the Software Development Manager, you will lead and mentor a high-performing team of software engineers while driving the development and maintenance of critical Neuron framework components. You'll drive cross-functional collaboration with compiler, runtime, and kernel development teams to ensure seamless integration of Neuron with major machine learning frameworks. You will also contribute technically by reviewing designs and implementing features.
A crucial aspect of your role will be building and nurturing strategic relationships with open-source communities, particularly with JAX, OpenXLA, and PyTorch/XLA. You'll work closely with these communities to align framework development roadmaps with Neuron's strategic objectives, ensuring our customers have access to the latest ML framework innovations.
Key job responsibilities
* Responsible for the overall systems development life cycle
* Management and execution against project plans and delivery commitments
* Manage the day-to-day activities of the engineering team
* Management of resources, staffing, mentoring, and maintaining a best-of-class engineering team
* Report on status of development, quality, operations, and system performance to management
- 3+ years of engineering team management experience
- 7+ years of working directly within engineering teams experience
- 3+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience
- Knowledge of engineering practices and patterns for the full software/hardware/networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
- Experience partnering with product or program management teams
- Experience in communicating with users, other technical teams, and senior leadership to collect requirements, describe software product features, technical designs, and product strategy
- Experience in recruiting, hiring, mentoring/coaching and managing teams of Software Engineers to improve their skills and make them more effective product software engineers
