

As a senior SDE in the pre-silicon team, you will be responsible for driving the pre-silicon hardware/software co-development for our machine learning chips. You will work with architecture, design and emulation teams to build new silicon functionality. You will write bare-metal software to verify the end-to-end functionality of the SoC and the functionality and performance of different subsystems in the SoC.
Work/Life Balance
Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
* 5+ years of experience in software development
* Knowledge of HW/SW interfaces and computer architecture
* Proficiency in programming in C/C++, scripting in Bash/Python
* Proficiency in data structures and algorithms
* Knowledge of low-level software such as firmware and device drivers
* Knowledge of SoC architecture
* Knowledge of I/O (PCIe, AXI), memory (HBM, DDR), CPU architecture, and interconnects
More jobs that might interest you

The Trainium Manufacturing, Quality and Reliability (MQR) Team is part of AWS Annapurna Labs, focused on machine learning products, which designs cutting-edge AI platforms for the world’s largest cloud services provider. As a Senior Reliability Engineer, you will engage with an experienced cross-disciplinary staff to conceive and design infrastructure technologies. You will work closely with an internal interdisciplinary team and outside partners to drive key aspects of product definition, execution, and test in manufacturing. A successful candidate will be responsive, flexible, and able to succeed within an open, collaborative peer environment. You will:
* Be responsible for the test validation of future technologies.
* Drive manufacturing process improvements to address reliability issues and concerns.
* Qualify manufacturing lines and mechanisms for mass production
* Have a fundamental understanding of reliability statistics and reliability tests, and/or a solid understanding of computer systems, to influence design for reliability.
* Lead the identification and validation of product/component risks, work with design teams to mitigate them, and define the test methodology and test coverage to assure product reliability.
* Deep-dive into technologies aligned with the product roadmap.
* Provide technical leadership and mentor engineers.
* Perform Reliability prediction of failure mechanisms, products under development and products in the field.
* Work with multiple vendors and ODMs to standardize component manufacturing and reliability expectations.
Key job responsibilities
* Define reliability tests to be implemented during manufacturing.
* Drive manufacturing process improvements to address reliability issues and concerns.
* Perform Reliability prediction of failure mechanisms, products under development and products in the field.
* Work with multiple vendors and ODMs to standardize component manufacturing and reliability expectations.
- Bachelor's or Master’s degree in Reliability Engineering, Physics or related field, or equivalent experience
- 7+ years of Reliability Engineering work experience with server compute platforms or on high-tech hardware

In this role you will be responsible for building and supporting a team which is critical in providing compute sanitization to the Neuron ML accelerators fleet. You will work closely with the hardware and software teams to ensure the right tools are available for identifying defects or faulty states of the hardware before the customer hits an issue. Neuron Compute Sanitizer Tools develops and maintains a pre-check and functional correctness checking suite and provides visibility at the fleet level to understand the trends of hardware/software sanitization.
Key job responsibilities
* Build and develop a strong team of engineers that would deliver the pre-check suite.
* Work closely with the hardware and firmware design teams.
* Collect requirements from various other teams including training, inference and runtime.
* Collaborate with the runtime team to ensure timely release of the pre-check tools.
* Anticipate future needs based on the product roadmap and develop necessary tools to sanitize compute.
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Work/Life Balance
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
About AWS Utility Computing (UC):
About AWS
About AWS Neuron:
- 8+ years of engineering experience
- 5+ years of engineering team management experience
- 10+ years of planning, designing, developing and delivering consumer software experience
- Experience partnering with product or program management teams
- Experience managing multiple concurrent programs, projects and development teams in an Agile environment
- Experience designing and developing large scale, high-traffic applications
- Experience with ML hardware/software

Technologies useful to this role include computer architecture, hardware description languages (HDLs), and embedded systems. Our team uses Verilog, C, C++, Lua, bash, Python and other similar languages. Although we use machine learning workloads to validate systems software, this team is focused on co-developing reliable server software and hardware for customers to deploy their ML workloads at scale.
Key job responsibilities
- Develop CPLD and FPGA programs that implement power sequencing and manage various protocols, including PWM, I2C, and SPI
- Develop systems software, kernel drivers
- Define test and automation flows to validate firmware
- Evaluate and optimize firmware performance
- Build error detection and recovery mitigation systems at AWS scale
A day in the life
You will have the opportunity to develop server firmware in a highly cross-functional environment, working side by side with software and hardware teams to optimize customer experience. You will be responsible for building scalable designs that can be tested throughout the stages of product development including manufacturing and production. You will leverage automation, continuous integration, and fleet metrics to deploy and monitor your changes.
Work/Life Balance
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- 3+ years of programming with at least one hardware description language (HDL) experience
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent
- Experience in embedded development in C/C++
- Experience in RTL development in Verilog, VHDL, or SystemC

You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
Key job responsibilities
The successful candidate will be operationally responsible for several of Amazon.com's Data Centers during the day shift. Some high-level responsibilities include:
- Prioritize and assign trouble tickets to data center technicians and operators
- Routinely review ticket queue for large events and address accordingly
- Coordinate change management resources
- Guide, train and educate data staff on the best practices related to all service owner issues
- Manage and deliver mid-size projects
- Recommend, document, and oversee policies and procedures to meet industry best practices and required SLAs
- Provide a weekly report to the data center manager
- Maintain the on-call schedule coordinating absence and vacations
- Recruit and train data technicians to ensure appropriate staffing levels
- Host weekly staff meetings
- Write and deliver performance reviews for staff
A day in the life
*Why AWS*
*Diverse Experiences*
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
*Work/Life Balance*
*Inclusive Team Culture*
*Mentorship and Career Growth*
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- 2+ years of engineering team management experience
- 4+ years of professional or military experience, or Bachelor's degree in computer science or equivalent
- Bachelor's degree
- 3+ years of networking and troubleshooting experience
- Experience with general troubleshooting/debugging of hardware, or experience in development in the last 3 years
- Experience with system management tools and client/server environments

Possessing a deep understanding of AWS products and services, as a Delivery Consultant you will be proficient in architecting complex, scalable, and secure solutions tailored to meet the specific needs of each customer. You’ll work closely with stakeholders to gather requirements, assess current infrastructure, and propose effective migration strategies to AWS. As trusted advisors to our customers, providing guidance on industry trends, emerging technologies, and innovative solutions, you will be responsible for leading the implementation process, ensuring adherence to best practices, optimizing performance, and managing risks throughout the project.
Key job responsibilities
As an experienced technology professional, you will be responsible for:
- Providing technical guidance and implementation support throughout project delivery, with a focus on using AWS AI/ML services
- Collaborating with customer stakeholders to gather requirements and propose effective model training, building, and deployment strategies
- Sharing knowledge within the organization through mentoring, training, and creating reusable artifacts
About the team
About AWS:
Diverse Experiences: AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description below, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Mentorship & Career Growth: We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- 8+ years of cloud based solution (AWS or equivalent), system, network and operating system experience
- 5+ years of experience hosting and deploying GenAI/ML solutions (e.g., for data pre-processing, training, deep learning, fine-tuning, and inference) and/or data science experience
- 8+ years of coding, data querying languages (e.g. SQL), scripting languages (e.g. Python)

This position can be located in Austin, Seattle, or Arlington (DC).
**Must be open to travel at least 30%, including international**
Key job responsibilities
- Develop solutions that make the best use of AWS services such as EC2, EKS, ECS, SageMaker, and other computing platforms for GenAI practice.
- Provide one-to-few and one-to-many training sessions to transfer knowledge to builders considering or already using AWS.
- Build deep relationships with senior technical individuals within partners to enable them to be cloud advocates.
- Develop proof-of-concepts for solutions involving AWS services.
- Drive product integrations between partner products and AWS services.
- Provide thought leadership in the form of blog posts, public speaking, white papers, and reference architectures.
A day in the life
- Building and testing a Proof of Concept (PoC) or creating a code sample.
- Writing a blog post or white paper.
AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Work/Life Balance
- 8+ years of specific technology domain areas (e.g. software development, cloud computing, systems engineering, infrastructure, security, networking, data & analytics) experience
- 3+ years of design, implementation, or consulting in applications and infrastructures experience
- 10+ years of IT development or implementation/consulting in the software or Internet industries experience
- Recent and demonstrable hands-on experience with AI/ML workloads.
- 5+ years of infrastructure architecture, database architecture and networking experience
- Knowledge of AWS services, market segments, customer base and industry verticals
- Experience working with end user or developer communities
