Expoint - all jobs in one place

JPMorgan Lead Site Reliability Engineer 
India, Karnataka, Bengaluru 
455875688

13.07.2024

Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability.


Job responsibilities

  • Demonstrates and champions site reliability culture and practices and exerts technical influence throughout your team
  • Leads initiatives to improve the reliability and stability of your team’s applications and platforms using data-driven analytics to improve service levels
  • Collaborates with team members to identify comprehensive service level indicators, and with stakeholders to establish reasonable service level objectives and error budgets with customers
  • Demonstrates a high level of technical expertise within one or more technical domains and proactively identifies and solves technology-related bottlenecks in your areas of expertise
  • CI/CD management and automation to achieve one-touch deployment across all application tiers
  • Writing specifications and documentation for application release management
  • Define application performance KPIs and create/manage the capacity framework
  • Infrastructure build, management, integration with core services and hygiene
  • Application setup, migration and maintenance in private/public cloud [AWS, Azure]
  • Docker containers: automating the container image creation process, and build and deployment in a container environment
  • Define application availability KPIs, set up monitoring frameworks and publish uptime and SLAs
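
As an illustration of the service level objective, error budget, and availability KPI items above, here is a minimal Python sketch of how an error budget might be derived from an SLO target. The class name, field names, and sample figures are hypothetical and are not taken from the posting; a real team would choose its own indicators and measurement windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Hypothetical request-based error budget for one measurement window."""

    slo_target: float      # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int    # requests served in the window
    failed_requests: int   # requests that violated the indicator

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # Fraction of the budget still unspent; negative means overspent.
        allowed = self.allowed_failures
        if allowed == 0:
            # A 100% target (or an empty window) leaves no budget to spend.
            return 0.0
        return 1.0 - self.failed_requests / allowed

    @property
    def availability(self) -> float:
        # Measured success ratio, the figure usually published as uptime.
        if self.total_requests == 0:
            return 1.0
        return 1.0 - self.failed_requests / self.total_requests


if __name__ == "__main__":
    # Assumed sample numbers, for illustration only.
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    print(f"allowed failures: {budget.allowed_failures:.0f}")
    print(f"budget remaining: {budget.remaining_fraction:.1%}")
    print(f"availability:     {budget.availability:.3%}")
```

A negative remaining fraction would signal that the service has spent more than its budget for the window, which is typically the trigger for prioritizing reliability work over new releases.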

Required qualifications, capabilities, and skills

  • Formal training or certification on software engineering concepts and 10+ years applied experience.
  • Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices with the ability to implement these practices within an application or platform
  • Strong Linux/Unix fundamentals, good understanding of subsystems such as memory, storage, network.
  • Experience with continuous integration technologies such as Jules, Maven, Ant, Selenium, Cucumber, Mocks, JMeter, JUnit, etc. is expected.
  • Ability to understand the business services and map them to reliability engineering design and review.
  • Support the technology and business services of the entire technology platforms from the scaling and performance perspective.
  • Manage the uptime of each microservice by building and implementing the right monitoring and alerts.
  • Good understanding of object-oriented programming, relational databases, NoSQL, caching systems, etc.
  • Strong problem management abilities: automating repeatable jobs and working with stakeholders to ensure incidents do not recur.

Preferred qualifications, capabilities, and skills

  • Self-starter and team player able to work effectively among and across Tech, Business, and Ops teams.
  • Excellent verbal and written communication skills.
  • Deep understanding of architectural concepts, issues and trends.
  • Ability to work both independently and in a team, and proficiency at researching innovative solutions for challenging technical problems.
  • Willingness to pick up and learn new technologies, frameworks and tools as directed.
  • Looking for someone who brings a lot of positive energy.