Expoint - all jobs in one place

JPMorgan SRE III 
India, Telangana, Hyderabad 
791673058

11.03.2025

Job responsibilities

  • Design and implement solutions to enhance the reliability and scalability of platforms and applications to accommodate rapidly growing demands.
  • Analyze defects, propose improvements, and drive efficiencies in systems and processes.
  • Optimize the performance and utilization of AI/ML platforms and infrastructure.
  • Develop observability, security, and FinOps tools and orchestration.
  • Author and improve the quality of technical engineering documentation.
  • Debug and solve issues in a production environment.
  • Participate in on-call rotations and escalation workflows.
  • Guide and assist others in building designs at the appropriate level and in gaining consensus from peers where appropriate.
  • Collaborate with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines.
  • Collaborate with other software engineers and teams to design, develop, test, and implement availability, reliability, and scalability solutions in their applications.
  • Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.

Required qualifications, capabilities, and skills

  • Formal training or certification on Site Reliability Engineering concepts and 3+ years applied experience
  • Proficiency in site reliability culture and principles, and familiarity with how to implement site reliability within an application or platform
  • Expertise in programming with Python and cutting-edge software engineering practices
  • Coding skills in at least one programming or scripting language such as Python, Java, PHP, shell scripting, or PowerShell
  • Experience in designing and implementing large-scale distributed systems and cloud-native architecture
  • Experience developing in the cloud, especially on AWS, and knowledge of Infrastructure as Code tools such as Terraform
  • Ability to identify new technologies and relevant solutions to ensure design constraints are met by the software team
  • Ability to initiate and implement ideas to solve business problems

Preferred qualifications, capabilities, and skills

  • Prior experience working in AI, ML, or Data engineering.
  • Systematic problem-solving and troubleshooting skills in a complex system.
  • Excellent communication skills working with stakeholders and domain experts across the company to design solutions to user problems.
  • Self-disciplined, self-managed, self-motivated with a strong sense of ownership, urgency, and