Expoint - all jobs in one place

JPMorgan Software Engineer III - ML Ops & Python 
India, Karnataka, Bengaluru 
94061945

08.04.2025

Job responsibilities

  • Set the direction for, and lead the development and implementation of, ML and GenAI-driven solutions.
  • Develop code and automation to provision and manage the cloud infrastructure and services required for model serving and inference.
  • Create pipelines for deploying models and ML pipelines, and execute them as part of the CI/CD process.
  • Be responsible for designing, provisioning, and executing monitoring solutions for infrastructure, platform, services and model health.
  • Deploy and manage the Machine Learning platform and its features in production.
  • Be responsible for the stability, reliability, security, and production runbooks of the machine learning solution.
  • Create, deploy, and manage cost management and optimization solutions for machine learning solutions in production.
  • Be responsible for registering model artifacts, maintaining code coverage, implementing functional and non-functional tests in the pipeline, and maintaining CI/CD frameworks.
  • Collaborate with Machine Learning engineers, product managers, key business stakeholders, engineering, and platform partners to deploy projects that deliver cutting-edge machine learning-driven digital solutions.
  • Communicate and collaborate with Platform and Engineering partners to bring in the latest advancements and improve the scale, consistency, reliability, and trustworthiness of the ML solutions.

Required qualifications, capabilities, and skills

  • Formal training or certification in software engineering concepts and 3+ years of applied experience.
  • Proven experience in deploying AI/ML applications in a production environment, with skills in deploying models on AWS platforms such as SageMaker or Bedrock.
  • Proven experience with MLOps practices across the full cycle, from design and experimentation through deployment to monitoring and maintenance of machine learning models.
  • Demonstrated expertise in deploying solutions involving machine learning frameworks: TensorFlow, PyTorch, PyG, Keras, and scikit-learn.
  • Expert skills in system monitoring tools such as Apica, Dynatrace, and Grafana.
  • Expert skills in CI/CD tools such as Jenkins, Spinnaker, Git, and Bitbucket.
  • Expertise in Infrastructure as Code frameworks, e.g. Terraform and EaC (Environment as Code).
  • Proficiency in a programming language such as Python or Java.
  • Working proficiency in at least one end-to-end machine learning framework or tool, e.g. MLflow, Kubeflow, SageMaker Studio, or Databricks.
  • Expert knowledge of a cloud computing platform, preferably Amazon Web Services (AWS), and of containerization technologies (Docker, Kubernetes, Amazon EKS, ECS).

Preferred qualifications, capabilities, and skills

  • Experience with large-scale training, validation, and testing.
  • Good understanding of AI/ML algorithms and techniques, including deep learning, reinforcement learning, and natural language processing (NLP).
  • Experience with Kubernetes-based platforms for training or inference.
  • Experience building and managing large-scale feature stores.
  • Proficiency in writing comprehensive test cases, with a strong emphasis on using testing frameworks such as pytest to ensure code quality and reliability.
  • Understanding of finance or investment banking businesses is an added advantage.