

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators.
Key job responsibilities
A day in the life
As you design and code solutions to help our team drive efficiencies in software architecture, you’ll create metrics, implement automation and other improvements, and resolve the root cause of software defects. You’ll also:
Participate in design discussions and code reviews, and communicate with internal and external stakeholders.
Work in a startup-like development environment, where you’re always working on the most important stuff.
- 5+ years of non-internship professional software development experience
- 5+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Fundamentals of machine learning and LLMs, including their architectures and their training and inference lifecycles, along with hands-on experience optimizing model execution
- Experience programming with at least one software programming language
- 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Master's degree in Computer Science or equivalent

In-house designed SoCs (System on Chips) are the brains and brawn behind AWS’s Machine Learning Acceleration servers, TRN and INF. Our team builds functional models of these ML accelerator chips to speed up SoC verification and system software development. We’re looking for a Hardware Simulator SDE to join the team and develop new C++ models of the SoC, infrastructure, and tooling for our customers.
As part of the ML acceleration modeling team, you will:
- Develop and own SoC functional models end-to-end, including model architecture, integration with other model or infrastructure components, testing, and debug
- Innovate on the tooling you provide to customers, making it easier for them to use our SoC models
- Drive model and modeling infrastructure performance improvements to help our models scale
- Develop software which can be maintained, improved upon, documented, tested, and reused
Annapurna Labs, our organization within AWS, designs and deploys some of the largest custom silicon in the world, with many subsystems that must all be modeled and tested with high quality. Our SoC model is a critical piece of software used both in our SoC development process and by our partner software teams. You’ll collaborate with many internal customers who depend on your models to do their own work effectively, and you'll work closely with these teams to push the boundaries of how we use modeling to build successful products.
You will thrive in this role if you:
- Are interested in functional modeling or have background in modeling hardware like SoCs, ASICs, TPUs, GPUs, or CPUs
- Are comfortable using C++ with OOP principles
Although we are building ML SoC models, no machine learning background is needed for this role. You’ll be able to ramp up on ML as part of this role, and any ML knowledge that’s required can be learned on the job.
- Currently enrolled in a Bachelor’s degree program or higher in Computer Science, Computer Engineering, Electrical Engineering, or a related field, with a graduation conferral date between December 2022 and September 2025
- Experience programming with C++ using OOP
- Familiarity with SoC, CPU, GPU, and/or ASIC architecture and micro-architecture
- Experience writing functional models of hardware, SoCs, ASICs, etc.
- Experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, and testing
- Familiarity with modern C++ (11, 14, etc.)
- Experience with PyTest or GoogleTest
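As a toy illustration of the functional modeling and PyTest-based testing described above (written in Python for brevity; the production models are C++, and every name here is hypothetical), a minimal register-block model and unit test might look like this:

```python
# Toy functional (cycle-inaccurate) model of a small memory-mapped register block,
# plus a PyTest-style unit test. Illustrative only; real SoC models are far richer.
class RegisterBlockModel:
    """Models a block of 32-bit read/write registers."""

    def __init__(self, num_regs: int = 4):
        self._regs = [0] * num_regs

    def write(self, addr: int, value: int) -> None:
        self._regs[addr] = value & 0xFFFFFFFF  # emulate 32-bit register width

    def read(self, addr: int) -> int:
        return self._regs[addr]


def test_write_then_read_returns_truncated_value():
    block = RegisterBlockModel()
    block.write(2, 0x1_2345_6789)      # bit 32 should be dropped by the model
    assert block.read(2) == 0x2345_6789
```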

Key job responsibilities
You will lead efforts to build distributed training support into PyTorch and JAX using XLA, the Neuron compiler, and runtime stacks. You will optimize models to achieve peak performance and maximize efficiency on AWS custom silicon, including Trainium and Inferentia, as well as Trn2, Trn1, Inf1, and Inf2 servers. Strong software development skills, the ability to dive deep, effective collaboration within cross-functional teams, and a solid foundation in machine learning are critical for success in this role.
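As a rough sketch of the framework integration involved, the snippet below runs a toy PyTorch training loop on an XLA device; on Trn/Inf instances with the torch-neuronx plugin installed, such a device is backed by a NeuronCore. This is an illustrative assumption, not this team's actual code.

```python
# Toy single-device training loop on an XLA device (backed by a NeuronCore when
# running on Trn/Inf instances with torch-neuronx installed). Model and data are toys.
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()                  # handle to the XLA device
model = nn.Linear(1024, 1024).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(10):
    x = torch.randn(32, 1024).to(device)
    y = torch.randn(32, 1024).to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    xm.mark_step()                        # flush the lazily traced graph for execution
```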
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About AWS
Work/Life Balance
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent

The Product: AWS Machine Learning accelerators are at the forefront of AWS innovation. The Inferentia chip delivers best-in-class ML inference performance at the lowest cost in the cloud. Trainium will deliver best-in-class ML training performance with the most teraflops (TFLOPS) of compute power for ML in the cloud. This is all enabled by our software stack, the AWS Neuron Software Development Kit (SDK), which includes an ML compiler, the Neuron Kernel Interface (NKI) compiler, and a runtime that natively integrates into popular ML frameworks such as PyTorch and TensorFlow.
Neuron Kernel Interface (NKI) is a bare-metal language and compiler for directly programming NeuronDevices available on AWS Trn/Inf instances. You can use NKI to develop, optimize, and run new operators directly on NeuronCores while making full use of the available compute and memory resources.
Learn more about Our History:
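As a toy illustration of programming a NeuronCore directly through NKI, here is a minimal elementwise-add kernel. It follows the pattern of the public NKI getting-started examples; the module paths and helpers used here (neuronxcc.nki, nki.language, nl.load, nl.store, nl.shared_hbm) are recalled from those examples and should be verified against the current SDK documentation.

```python
# Minimal NKI kernel sketch: elementwise add of two small tensors on a NeuronCore.
from neuronxcc import nki
import neuronxcc.nki.language as nl

@nki.jit
def add_kernel(a, b):
    out = nl.ndarray(a.shape, dtype=a.dtype, buffer=nl.shared_hbm)
    a_tile = nl.load(a)                   # HBM -> on-chip SBUF
    b_tile = nl.load(b)
    nl.store(out, value=a_tile + b_tile)  # SBUF -> HBM
    return out
```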
You have knowledge of resource management, scheduling, code generation, optimization, and instruction architectures, including CPU, NPU, GPU, and novel forms of compute.
Explore the Product:
Work/Life Balance
Mentorship & Career Growth
- 5+ years of engineering team management experience
- 9+ years of working directly within engineering teams experience
- 4+ years of designing or architecting (design patterns, reliability and scaling) of new and existing systems experience
- Experience partnering with product or program management teams
- Understanding of compilers (resource management, instruction scheduling, code generation, and compute graph optimization)
- Strong software design fundamentals and excellent system-level coding skills
- M.S. or Ph.D. in Computer Science or related technical field

Key job responsibilities
In this role you'll develop, design, maintain, deploy, monitor, and support a very important component in the Nitro firmware, while enjoying every step of the journey.
About the team
Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Why AWS
Work/Life Balance
Inclusive Team Culture
Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent

Key job responsibilities
- Develop and maintain integrations using RESTful services, SOAP, and database connections
- Develop endpoints in systems (e.g. NetSuite) that connect to AWS
- Architect robust error management and control systems
- Conduct code reviews and maintain high code quality standards
- Create comprehensive technical documentation
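As a minimal sketch of the event-driven integration work described above, a Lambda handler might forward incoming order events to a downstream queue for the ERP sync job. The queue URL, environment variable, and payload fields are illustrative assumptions, not a real system.

```python
# Hypothetical Lambda handler: read order events from an SQS trigger, reshape them,
# and forward them to a downstream queue (e.g. for a NetSuite sync job).
import json
import os
import boto3

sqs = boto3.client("sqs")
DOWNSTREAM_QUEUE_URL = os.environ["DOWNSTREAM_QUEUE_URL"]  # assumed configuration

def handler(event, context):
    records = event.get("Records", [])
    for record in records:
        order = json.loads(record["body"])
        payload = {
            "externalId": order["order_id"],
            "amount": order["total"],
            "currency": order.get("currency", "USD"),
        }
        sqs.send_message(QueueUrl=DOWNSTREAM_QUEUE_URL, MessageBody=json.dumps(payload))
    return {"forwarded": len(records)}
```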
Technical Expertise:
- Deep knowledge of AWS services, including Lambda, S3, Redshift, API Gateway, CloudWatch, EventBridge, SQS, SNS, and B2B Data Interchange
- Experience building APIs and system integrations
- Excellent background in systems integration and data processing
- Experience in automating, deploying, and supporting large-scale infrastructure
- Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust
- Experience with CI/CD pipelines and build processes
- Experience with distributed systems at scale
Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Custom SoCs (System on Chip) live at the heart of AWS Machine Learning servers. As a member of the Cloud-Scale Machine Learning Acceleration team, you’ll be responsible for the design and optimization of hardware in our data centers, including AWS Inferentia and Trainium systems (our custom-designed machine learning inference and training datacenter servers). Our success depends on our world-class server infrastructure; we’re handling massive scale and rapid integration of emergent technologies. We’re looking for an ASIC Physical Design Engineer to help us trail-blaze new technologies and architectures while ensuring high design quality and making the right trade-offs.
Key job responsibilities
- Work with RTL/logic designers to drive architectural feasibility studies, explore power-performance-area tradeoffs for physical design closure
- Drive IO/Core block physical implementation through synthesis, floor planning, bus / pin planning, place and route, power/clock distribution, congestion analysis, timing closure, IR drop analysis, physical verification, ECO and sign-off
- Develop physical design methodologies
- Evaluate 3rd party IP and provide recommendations
- BS + 8yrs or MS + 6yrs in EE/CS
- 6+ years in ASIC physical design, from RTL to GDSII, in 7nm, 14/16nm, 20nm, or 28nm process nodes
- Block Design using EDA tools (examples: Cadence, Mentor Graphics, Synopsys, or Others) including synthesis, equivalency verification, floor planning, bus / pin planning, place and route, power/clock distribution, congestion analysis, timing closure, IR drop analysis, physical verification, and ECO
- Deep understanding of sign-off activities (timing, IR/EM, physical verification)
- Scripting experience with Tcl, Perl or Python
- Expertise using CAD tools (examples: Cadence, Mentor Graphics, Synopsys, or others) to develop flows for synthesis, formal verification, floor planning, bus/pin planning, place and route, power/clock distribution, congestion analysis, timing closure, IR drop analysis, physical verification, and ECO
- 4+ years of experience integrating IP, with the ability to specify and drive IP requirements in the physical domain
- Thorough knowledge of device physics, custom/semi-custom implementation techniques
- Experience solving physical design challenges across various technologies such as DDR, PCIe, fabrics etc.
- Experience extracting design parameters and QoR metrics and analyzing trends (see the scripting sketch after this list)
- Ability to provide mentorship and guidance to junior engineers and to be a highly effective team player
- Meets/exceeds Amazon’s leadership principles requirements for this role
- Meets/exceeds Amazon’s functional/technical depth and complexity for this role
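As a small illustration of the scripting side of this work (see the QoR bullet above), the sketch below pulls WNS/TNS numbers out of a plain-text timing summary so they can be trended across runs. The report format matched by the regular expressions is a made-up example, not any specific EDA tool's output.

```python
# Hypothetical QoR extraction helper: parse worst/total negative slack from a
# plain-text timing summary and print one line per report for trend tracking.
import re
import sys

WNS_RE = re.compile(r"WNS\s*[:=]\s*(-?\d+\.\d+)")
TNS_RE = re.compile(r"TNS\s*[:=]\s*(-?\d+\.\d+)")

def extract_qor(report_path):
    with open(report_path) as f:
        text = f.read()
    wns = WNS_RE.search(text)
    tns = TNS_RE.search(text)
    return {
        "wns_ns": float(wns.group(1)) if wns else None,
        "tns_ns": float(tns.group(1)) if tns else None,
    }

if __name__ == "__main__":
    for path in sys.argv[1:]:
        print(path, extract_qor(path))
```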
