Expoint - all jobs in one place

EY DET-TT- SRE Senior-GDS04 
India, Tamil Nadu, Chennai 
278021555

29.08.2024

Senior Site Reliability Engineer - Senior level
Description

  • Site Reliability Engineering (SRE) is a modern way of delivering IT solutions that applies software engineering principles to service delivery in order to reduce IT risk to the business, improve business resilience, attain predictability and reliability, and optimize the cost of IT infrastructure and operations.
  • A Site Reliability Engineer typically has deep software engineering experience spanning the design, build, deployment, and management / maintenance of an IT solution, ensuring resilience, reliability, and performance.
  • An SRE bridges development and operations by applying a software engineering mindset to the development, deployment, and maintenance of applications, maximizing system reliability and automation while improving efficiency through resource optimization.

Responsibilities

  • Defining SLA/SLO/SLI for a product / service
  • Engineering resilient design and implementation practices into solutions throughout the product life cycle
  • Engineering out manual effort (toil) through the development of automated processes and services (e.g., automated management of systems, CI/CD improvements)
  • Developing observability solutions to track, report, and measure SLA adherence
  • Helping optimize the cost of IT infrastructure and operations (FinOps)
  • Managing critical situations
  • Automating SOPs / runbooks and reducing toil
  • Data Analytics & System trend analysis

Typical Skills and Background

  • 7+ years of experience in software product engineering principles, processes and systems
  • Hands-on experience in Java / J2EE, one web server (Apache Tomcat or IBM HTTP Server), one application server (Tomcat/WebSphere), and any major RDBMS such as Oracle
  • Hands-on experience in at least one CI/CD tool (Azure DevOps, GitLab CI/CD, Jenkins) and at least one IaC tool (Terraform, AWS CloudFormation, Ansible, etc.)
  • Experience in at least one cloud technology (AWS/Azure/GCP etc., and Docker, Pivotal, Kubernetes, OpenShift etc.) and its reliability tools (Azure Application Insights, CloudWatch, Azure Monitor etc.)
  • Experience in observability: APM tools (Dynatrace, AppDynamics etc.), metrics / log consolidation (Splunk), and the ELK Stack
  • Experience defining NFRs and SLA/SLO/SLI agreements for a product / platform / service
  • Knowledge of queuing models, thread pools, request-servicing processes, etc.
  • Knowledge of Web Services, SOA, ESB (DataPower), and RESTful services
  • Knowledge of application design patterns, J2EE application architectures, microservices, Spring Boot, and cloud-native architectures
  • Proficiency in Java runtimes, Core Java, garbage collection, and JVM parameter tuning
  • Experience in performance tuning on Application Servers (Tomcat/WAS)
  • Experience in troubleshooting performance / scalability / availability issues
  • Experience in generating and analyzing thread dumps and heap dumps
  • Knowledge of query tuning and of database design and data models
  • Knowledge of at least one automation scripting language, such as Python
  • Mastery of collaborative software development using Git, Jira, Confluence, etc.
  • AI/ML and data analytics knowledge and experience are desirable
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.