You will be joining our advanced optimization engineering team, driving innovation in AI-powered compute optimization. This role will play a pivotal part in building scalable services and intelligent automation for optimizing large-scale data workloads running on cloud platforms like AWS EKS, AWS EMR, Databricks, and Snowflake.
You will help shape the architecture, develop reusable SDKs, transform datasets, and support AI/ML-driven decision systems that improve cost efficiency and performance across heterogeneous environments.
Job responsibilities
- Execute creative software solutions, including design, development, and technical troubleshooting, with the ability to think beyond conventional approaches to build solutions or resolve technical problems
- Develop secure, high-quality production code, and review and debug code written by others
- Identify opportunities to eliminate or automate the remediation of recurring issues to enhance the overall operational stability of software applications and systems
- Lead evaluation sessions with external vendors, startups, and internal teams to drive outcomes-oriented assessments of architectural designs, technical credentials, and their applicability within existing systems and information architecture
- Lead communities of practice across Software Engineering to promote awareness and adoption of new and leading-edge technologies
- Contribute to a team culture of diversity, equity, inclusion, and respect
- Develop and deploy cloud infrastructure platforms that are secure, scalable, and optimized for AI and machine learning workloads
- Collaborate with AI teams to understand computational needs and translate these into infrastructure requirements
- Monitor, manage, and optimize cloud resources to maximize performance and minimize costs
- Design and implement continuous integration and delivery pipelines for machine learning workloads
- Develop automation scripts and infrastructure as code to streamline deployment and management tasks
Required qualifications, capabilities, and skills
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Hands-on practical experience in system design, application development, testing, and operational stability
- Advanced proficiency in one or more programming languages, such as Python or Java
- Experience with AI/ML model integration, prompt engineering, or LLM APIs (OpenAI, Bedrock)
- Experience with AWS services (S3, EKS, Step Functions, Lambda, RDS, API Gateway)
- Strong understanding of containerization and orchestration using Docker and Kubernetes
- Familiarity with data processing frameworks such as Apache Spark or Flink
- Experience building RESTful APIs and working with event-driven or serverless architectures
- Proficiency with Git, CI/CD pipelines, Terraform, and automated testing frameworks
- Demonstrated knowledge of software applications and technical processes within a technical discipline (e.g., cloud, artificial intelligence, machine learning, mobile)
Preferred qualifications, capabilities, and skills
- Exposure to Snowflake, Databricks, EMR, or related data platforms
- Familiarity with metrics processing tools (CloudWatch, Dynatrace, Datadog)