We are looking for a highly skilled Senior Machine Learning Engineer to lead the development and delivery of technologies that push the boundaries of efficient inference for open-source Generative Artificial Intelligence (GenAI) models. You will have an enormous opportunity to make an impact on the design, architecture, and implementation of cutting-edge technologies used every day. This role offers the chance to work in a highly technical domain at the boundary between fundamental AI research and production engineering, on inference-efficiency techniques such as quantization, speculative decoding, and long context.
Key job responsibilities
* Create solutions that facilitate the usage and building of artificial intelligence workflows and optimize them for cost and latency.
* Collaborate with cross-functional teams of engineers and scientists to identify and solve complex problems in GenAI.
* Design, prototype, and evaluate new inference engines and optimization techniques.
* Participate in deep-dive analysis and profiling of production code. Optimize inference performance across various platforms.
* Hold a high bar for technical excellence within the team and across the organization.
About the team
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Basic Qualifications
- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one programming language
- 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience as a mentor, tech lead, or leader of an engineering team
- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
Preferred Qualifications
- Experience with inference frameworks such as vLLM, PyTorch, TensorFlow, or TensorRT
- Proficiency in performance optimization on GPUs or Trainium
- Proficiency in kernel programming for accelerated hardware using programming models such as (but not limited to) CUDA
- Experience with latency-sensitive optimizations and real-time inference
- Knowledge of model optimization techniques
- Strong communication skills and ability to work in a collaborative environment
- Passion for solving complex problems and driving innovation in AI technology
- MS or PhD in Mathematics or Electrical Engineering, with sufficient coding expertise; CUDA/kernel experience preferred