Key job responsibilities
- Research and implement agentic AI frameworks to solve diverse visual AI challenges in the physical world
- Develop novel approaches to token compression and memory management for processing continuous visual streams
- Design and optimize distributed training systems for large-scale visual AI models
- Build efficient inference pipelines that can process multiple visual streams within strict resource constraints
- Implement hybrid edge-cloud architectures for deploying visual reasoning systems at scale
A day in the life
As an MLE on our team, you'll be at the forefront of developing next-generation visual reasoning systems. You'll tackle challenges like designing efficient architectures for processing visual information, implementing sophisticated memory management for long-term reasoning, and developing novel approaches to maintain AI capabilities while optimizing for real-world constraints. You'll collaborate closely with Applied Scientists to advance the state-of-the-art while ensuring our solutions meet rigorous requirements for accuracy, latency, and cost.
About the team
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Work/Life Balance
- 3+ years of non-internship professional software development experience, including coding standards, code reviews, source control management, build processes, testing, and operations.
- 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems.
- Background in computer vision, multi-modal foundation models, or related fields.
- Proficiency in Python and distributed systems.
- Experience with deep learning frameworks (PyTorch, TensorFlow, etc.).
- Experience with real-time inference optimization, distributed training systems, GPU utilization and memory optimization techniques.
- Familiarity with multi-modal foundation models and with pre-training and post-training techniques.
- Master's or PhD degree in Computer Science, Machine Learning, or related field.
- Experience designing efficient data preprocessing pipelines, building and scaling multi-modal model architectures, and conducting robust evaluation at scale.
- Deep understanding of visual reasoning architectures and agentic AI systems.
- Experience with edge computing and model optimization for resource-constrained environments.
- Track record of contributions to visual AI or multi-modal learning systems.
- Expertise in large-scale video processing and analysis.