Expoint - all jobs in one place

CyberArk Senior Site Reliability Engineer
India 
572435742

21.11.2024

Job Description

You will collaborate closely with development, operations, and other teams to implement and maintain efficient and resilient systems.

Responsibilities:

  • Infrastructure Automation: Develop, deploy, and oversee Infrastructure as Code (IaC) solutions using tools such as Terraform and Ansible to automate the provisioning, configuration, and deployment processes.
  • Cloud Platform Expertise: Deep understanding of AWS cloud services, including EC2, S3, VPC, RDS, EKS, ECS, CF and more. Experience with serverless architecture and AWS Lambda functions is a plus.
  • Containerization and Orchestration: Proficiency in containerization technologies (Docker) and orchestration platforms (Kubernetes, K8s), with experience deploying applications using tools like Helm.
  • CI/CD Pipelines: Build and maintain robust CI/CD pipelines using tools like Jenkins.
  • Monitoring and Alerting: Implement comprehensive monitoring and alerting solutions using tools such as ELK, Datadog, CloudWatch, and Grafana to proactively identify and resolve issues.
  • Incident Management: Drive incident response processes, troubleshoot complex issues, and perform root cause analysis (RCA) with corrective and preventive actions (CAPA) to prevent recurrence.
  • Performance Tuning: Continuously optimize system performance, identify bottlenecks, and implement strategies to improve scalability and efficiency.
  • Cost Optimization: Identify and implement strategies to reduce cloud costs while maintaining performance and reliability.
  • Security Best Practices: Adhere to security best practices and implement measures to protect infrastructure and data from vulnerabilities and threats.
  • Collaboration and Communication: Work effectively with cross-functional teams to understand business requirements and provide technical guidance.
  • SOP Documentation: Create and maintain documentation for infrastructure, processes, and incident management protocols.

Qualifications:

  • 7+ years of experience as a DevOps engineer or Site Reliability Engineer
  • B.Tech computer
  • Strong proficiency in AWS cloud services such as EC2, S3, VPC, RDS, EKS, ECS, CF and more. An AWS certification is a plus.
  • 3+ years of experience with serverless architectures using AWS Lambda.
  • Strong scripting skills (Python, PowerShell, CDK, Shell scripting).
  • Knowledge of CDK (Cloud Development Kit) for infrastructure as code.
  • Experience with infrastructure as code tools (Terraform, Ansible) and AWX Tower for Ansible automation.
  • Knowledge of containerization (Docker) and orchestration platforms (Kubernetes).
  • Expertise in CI/CD pipelines and automation tools (Jenkins, GitHub).
  • Exposure to monitoring and alerting tools (CloudWatch, Datadog, ELK, Grafana, New Relic).
  • Experience documenting SOPs and RCAs.
  • Understanding of security best practices and compliance standards. Security Certification is a plus.